COMPOUNDS AND METHODS FOR ANALYZING THE PROTEOME
RELATED APPLICATIONS 

Benefit of priority is claimed to U.S. provisional application Serial 
No. 60/363,433, filed March 11, 2002, to Koster et al., entitled
"COMPOUNDS AND METHODS FOR ANALYZING THE PROTEOME."
This application is also related to U.S. application No. (attorney docket
no. 24743-2305C), filed March 11, 2003. This application is related to 
U.S. application Serial No. 10/197,954 and International PCT application 
No. PCT/US02/22821. 

For U.S. national stage purposes and where permitted, the 
disclosures of the above-referenced provisional patent application, PCT
application and U.S. applications are incorporated herein by reference in 
their entirety. 
FIELD 

Provided herein are compounds and methods using the compounds 
to specifically and selectively analyze biomolecules. In particular, the
compounds and methods are useful for analyzing the proteome. 
BACKGROUND 

The Human Genome effort has generated a raw sequence of the 3 
billion base pairs of the human genome and revealed about 35,000 genes. 
Genetic variations among different individuals and between
populations are being studied in order to determine their association with
the predisposition to disease or their correlation to drug efficacy and/or
side effects. 

A frequent manifestation of genetic variation is the single nucleotide
polymorphism (SNP). Technologies have been developed to analyze
SNPs on an industrial scale (e.g., MassARRAY™ and the MassARRAY®
system, Sequenom, Inc., San Diego, CA) and in pooled samples to study
the frequency of SNPs in populations of various gender, ethnicity, age
and health condition. The ultimate goal of these efforts is to understand
the etiology of disease on the molecular level (e.g., based on genetic
variances (pharmacogenomics)), in order to develop diagnostic assays
hand-in-hand with new, more effective drugs free of side effects.

Knowledge of the association of an SNP (or SNPs) with a given
disease or drug side-effect, however, is not sufficient. Further, it is not
sufficient to establish the differential expression profiles of messenger
RNAs by comparing healthy and diseased tissue samples, as is performed
today using expression DNA chips (e.g., GeneChip™ technology,
Affymetrix, Inc., Santa Clara, CA; LifeArray™ technology, Incyte
Genomics, Inc., Palo Alto, CA). This is because the metabolic activities in
a cell are carried out not by mRNAs but by the proteins that
are translated from them and subsequently post-translationally modified
(alkylated, glycosylated, phosphorylated, etc.).

The study of proteomics encompasses both the study of individual
proteins and how these proteins function within a biochemical pathway.
Proteomics also includes the study of protein interactions with regard to
how they form the architecture that makes up living cells. In many
human diseases, such as cancer, Alzheimer's disease and diabetes, as well as
host responses to infectious diseases, the elucidation of the complex
interactions of regulatory proteins, which can cause diseases, is a critical
step to finding effective treatment. Often, SNPs and other nucleic acid
mutations occur in genes whose products are such proteins as (1) growth
related hormones, (2) membrane receptors for growth hormones, (3)
components of the trans-membrane signal pathway, and (4) DNA binding
proteins which act on transcription and the inactivation of suppressor
genes (e.g., p53), causing the onset of disease.

The only technology that is currently being used to analyze 
complex protein mixtures is 2D gel electrophoresis and subsequent image
processing to identify changes in the pattern (structural changes) or
intensity of various protein spots. 2D gel electrophoresis is a laborious,
error-prone method with low reproducibility and cannot readily
be automated. Further, the resolution of 2D gels is insufficient to
display all the proteins present in a mixture. Therefore, the analysis of
the proteome needs technologies that allow scaling up to industrial
levels with the features of an industrial process: high accuracy,
reproducibility and flexibility, in a process that is high-throughput,
automatable and cost-effective. It is an object herein to provide such
technologies.
SUMMARY 

Gradient arrays for the analysis of biomolecules are provided. In 
particular, arrays and methods are provided for analyzing complex protein 
mixtures, such as the proteome. The arrays provide bifunctional reagents
that allow for the separation and isolation of complex protein mixtures.
Automated instruments for performing the methods are also provided. 

Provided herein are methods and arrays of capture compounds (also
referred to herein as capture agents) for analysis of the proteome on an
industrial level in a high throughput format. The methods and arrays
permit sorting of complex mixtures of biomolecules. In addition, they
permit identification of protein structures predictive or indicative of
specific phenotypes, such as disease states, thereby eliminating the
need for random SNP analysis, expression profiling and protein analytical
methods. The arrays and methods sort complex mixtures by providing a
variety of different capture agents. In addition, they can be used to
identify structural "epitopes" that serve as markers for specific disease
states, stratify individual populations relative to specific phenotypes,
permit a detailed understanding of the proteins underlying molecular
function, and provide targets for drug development. The increased
understanding of target proteins permits the design of higher efficiency
therapeutics.

Arrays of capture compounds and methods that use the
compounds, singly or in collections thereof, to capture, separate and
analyze biomolecules are provided. The biomolecules include, but are not
limited to, mixtures of biomolecules, including biopolymers and
macromolecules, such as proteins, and individual biomolecules, such as
proteins, including individual or membrane proteins. The arrays contain a
plurality, generally at least two or three, typically at least 10, 50, 100,
1000 or more different capture compounds, where different compounds are
located at each locus.

In particular, the arrays are gradient arrays, which are generally
2-dimensional addressable arrays of capture compounds. In these
embodiments the capture compounds, which are defined herein, are those
that contain moieties X and optionally Y linked to a solid support
(referred to as Z below). Hence the arrays are a 2-dimensional array of
moieties X and optionally Y on a surface Z that presents moieties X and
Y. The surface Z is a solid support; the X and Y moieties are arranged in
a two dimensional array of continuous gradients of one or more properties
of X and Y; the properties of moiety X are in a gradient along the X-axis;
the properties of moiety Y are in a gradient along the Y-axis; the X
moieties are each independently selected to bind to biomolecules
covalently or with sufficiently high affinity so that the resulting complexes
of biomolecule/capture compounds are stable under conditions of mass
spectrometric analysis; and the Y moieties are each independently
selected to increase the selectivity of the binding by moiety X.

For example, each X moiety in each row (X-axis) of loci can differ 
in a gradual manner in hydrophilicity, lipophilicity, charge, size, reagent 
specificity or other such property. Each Y moiety, if present, in each
column (Y-axis) of loci can differ in a gradual manner, for example, in
hydrophilicity, lipophilicity, charge, size, reagent specificity or other such 
property. In some embodiments, the Y moieties can be present at each 
locus with the X moieties or can be bound to each X moiety. 
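
The two-axis layout described above can be pictured with a short sketch. The Python below is a hypothetical model and not part of the disclosure: the names `Locus`, `linspace` and `make_gradient_array`, the use of evenly spaced (linear) steps, and the numeric property ranges are all assumptions chosen for illustration. It only shows how a property of moiety X can vary along each row (X-axis) while a property of moiety Y varies along each column (Y-axis).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Locus:
    """One addressable position on the (hypothetical) gradient array."""
    row: int        # index along the Y-axis
    col: int        # index along the X-axis
    x_value: float  # property of moiety X here (e.g., hydrophilicity)
    y_value: float  # property of moiety Y here (e.g., charge)


def linspace(start: float, stop: float, n: int) -> List[float]:
    """Return n evenly spaced values from start to stop inclusive."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return [start]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def make_gradient_array(
    n_cols: int,
    n_rows: int,
    x_range: Tuple[float, float],
    y_range: Tuple[float, float],
) -> List[List[Locus]]:
    """Build a grid in which X varies along rows and Y along columns."""
    xs = linspace(x_range[0], x_range[1], n_cols)
    ys = linspace(y_range[0], y_range[1], n_rows)
    return [
        [Locus(row=r, col=c, x_value=xs[c], y_value=ys[r]) for c in range(n_cols)]
        for r in range(n_rows)
    ]
```

In this sketch every locus in a given column shares the same X property and every locus in a given row shares the same Y property, so each locus presents a distinct pair of values. A non-linear gradient would only require a different spacing function.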

For example, the X or Y moiety can be an azobenzene group and a
gradient of hydrophilicity can be created by increasing (or decreasing)
light exposure at each locus in the array. The X moiety can be a charged
group and a gradient of charge can be created by exposure to an increase
in current or voltage. Other properties include increased or decreased
specificity of the X or Y groups for NH2, SH, SS or OH groups.

The resulting arrays present a surface with a plurality of loci (10, 
50, 100, 500, 1000, 2000, 3000, 4000, 5000, 10,000 and more) that 
differ in a particular property, pair of properties or plurality of
properties in defined increments. Such a surface permits capture of
molecules and biological particles with differing affinities for the X
moieties at discrete loci. These gradient arrays can be used in methods
for sorting complex mixtures or probing cell and organelle surfaces and in
other methods described herein or, for example, in copending U.S.
application Serial No. 10/197,954 or International PCT application No.
PCT/US02/22821.

Thus the arrays are designed to permit probing of a mixture
of biomolecules by virtue of interaction of the capture compounds in the
collection with the components of the mixture under conditions that
preserve their three-dimensional configuration. Each locus in the array is
designed (1) to bind (either covalently or by another chemical interaction
with high binding affinity (k_a), such that the binding is irreversible or
stable under conditions of mass spectrometric analysis) to fewer than all,
typically about 5 to 20 or more, component biomolecules in a mixture,
depending upon the complexity and diversity of the mixture, under
physiological conditions, including hydrophobic conditions, and
(2) to distinguish among biomolecules based upon topological features.

The arrays are used in a variety of methods, but are particularly 
designed for assessing biomolecules, such as biopolymer components, in
mixtures from biological samples. The arrays are used in top-down
unbiased methods that assess structural changes, including
post-translational structural changes and, for example, are used to compare
patterns, particularly post-translational protein patterns, in diseased
versus healthy cells from primary cells, generally from the same individual.
The cells that serve as the sources of biomolecules can be frozen into a
selected metabolic state or synchronized to permit direct comparison and
identification of phenotype-specific, such as disease-specific,
biomolecules, generally proteins.

A locus in the array includes a chemical reactivity group X (also
referred to herein as a function or a functionality), which effects the
covalent or high binding affinity (high k_a) binding, and at least one of
three other groups (also referred to herein as functions or functionalities).
The loci also can include a selectivity function Y that modulates the
interaction of a biomolecule with the reactivity function.

For example, the reactivity group (reactivity function) includes
groups that specifically react or interact with functionalities on the
surface of a protein, such as hydroxyl, amine, amide, sulfide and
carboxylic acid groups, or that recognize specific surface areas, such as
an antibody, a lectin or a receptor-specific ligand, or that interact with the
active site of enzymes. Those skilled in the art can select from a library
of functionalities to accomplish this interaction. While this interaction
can be highly reaction-specific, these compounds can react multiple times
within the same protein molecule depending on the number of
surface-accessible functional groups. Modification of the reaction conditions
allows the identification of surface accessible functional groups with
differing reactivity, thereby permitting identification of one or more highly
reactive sites used to separate an individual protein from a mixture.
Available technologies do not separate species in the resulting reaction
mixture. The collections and compounds provided herein solve that
problem through a second functionality, the selectivity group, which
alters binding of the reactivity groups to the biomolecule.

Selectivity functions include a variety of groups, as well as the 
geometric spacing of the second functionality, a single stranded
unprotected or suitably protected oligonucleotide or oligonucleotide
analog. The selective functionality can be separate from the compound
and include the solid or semi-solid support. The selective functionality in
this embodiment can be porosity, hydrophobicity, charge and other
chemical properties of the material. For example, selectivity functions
interact noncovalently with target proteins to alter the specificity or
binding of the reactivity function. Such functions include chemical groups
and biomolecules that can sterically hinder proteins of specific size,
hydrophilic compounds or proteins (e.g., PEG and trityls), hydrophobic
compounds or proteins (e.g., polar aromatic, lipids, glycolipids,
phosphotriester, oligosaccharides), positively or negatively charged groups,
and groups or biomolecules which create defined secondary or tertiary
structure.

In practice in one embodiment, a gradient array is contacted with a 
biomolecule mixture and the bound molecules are assessed using, for 
example, mass spectrometry, followed by optional application of tagging,
such as fluorescence tagging, after arraying to identify low abundance
proteins. In other embodiments, a single capture compound is contacted
with one or a plurality of biomolecules, and the bound molecules are
assessed.

Also provided herein are methods for the discovery and
identification of proteins, which are selected based on a defined
phenotype. The methods allow proteins to bind to the target molecules
under physiological conditions while maintaining the correct secondary
and tertiary conformation of the target. The methods can be performed
under physiological and other conditions that permit discovery of
biologically important proteins, including membrane proteins, that are
selected based upon a defined phenotype.

The gradient arrays can be composed of moieties selected to 

capture target proteins or groups of related proteins that can mimic
biological structures such as nuclear and mitochondrial transmembrane
structures, artificial membranes or intact cell walls. 

Samples for analysis include any biomolecules, particularly
protein-containing samples, such as protein mixtures, including, but not
limited to, natural and synthetic sources. Proteins can be prepared by
translation from isolated chromosomes, genes, cDNA and genomic
libraries. Proteins can be isolated from cells and other sources. In
certain embodiments, the capture compounds provided herein are
designed to selectively capture different post-translational modifications
of the same protein (i.e., phosphorylation patterns (e.g., oncogenes),
glycosylation and other post-translational modifications).

Other methods that employ the collections are also provided. In 
one method, the arrays are used to distinguish between or among 
different conformations of a protein and, for example, can be used for
phenotypic identification, such as for diagnosis. For example, in diseases
of protein aggregation, which are diseases involving a conformationally
altered protein, such as amyloid diseases, the collections can distinguish
the disease-involved form of the protein from the normal protein
and thereby diagnose the disease in a sample.
BRIEF DESCRIPTION OF THE FIGURES

Figure 1 shows the hybridization, separation and mass spectral 
analysis of a mixture of proteins. 

Figure 2 provides a schematic depiction of one embodiment of the
apparatus provided herein.

Figure 3 shows exemplary, non-limiting examples of dendrimeric 
structures upon which the compounds provided herein are based. 

Figure 4 illustrates masking synthesis of a chip possessing patches
of varying hydrophobicity/hydrophilicity.

Figure 5 shows an exemplary chip possessing a continuous
two-dimensional gradient of charge (Y-axis) and hydrophilicity/hydrophobicity
(X-axis).

Figure 6 illustrates a light-tunable hydrophobic/hydrophilic diazo 
switch. 

Figure 7 shows a continuous two-dimensional substrate having a
gradient of electric charges in one dimension and a gradient of
hydrophobic/hydrophilic groups tunable by a light gradient in the other
dimension.

Figure 8 illustrates a protein tagged with four compounds provided
herein, thereby allowing for specific sorting of the protein.

Figure 9 shows the increased and specific hybridization resulting
from use of two or more oligonucleotide tags.

Figure 10 illustrates hybridizations possible with dendrimeric
oligonucleotide-tagged proteins.

Figure 11 shows tagging of a single protein with two different
oligonucleotides in one reaction.

Figure 12 shows various dendrimeric structures for Z.

Figure 13 is a flow diagram of recombinant protein production. 

Figure 14 illustrates production of an adapted oligonucleotide dT
primed cDNA library.

WO 03/077851   PCT/US03/07479

Figure 15 shows production of an adapted sequence motif specific
cDNA library.

Figure 16 shows production of an adapted gene specific cDNA.

Figure 17 illustrates purification of amplification products from a
template library.

Figure 18 shows an adapted oligonucleotide dT primed cDNA
library as a universal template for the amplification of gene
subpopulations.

Figure 19 illustrates decrease of complexity during PCR
amplification.

Figure 20 shows the attachment of a bifunctional molecule to a
solid surface.

Figure 21 shows analysis of purified proteins from compound
screening and antibody production.

DETAILED DESCRIPTION OF THE EMBODIMENTS 
A. Definitions 

Unless defined otherwise, all technical and scientific terms used 
herein have the same meaning as is commonly understood by one of skill
in the art to which the invention(s) belong. All patents, patent
applications, published applications and publications, Genbank sequences,
websites and other published materials referred to throughout the entire
disclosure herein, unless noted otherwise, are incorporated by reference
in their entirety. In the event that there are a plurality of definitions for
terms herein, those in this section prevail. Where reference is made to a
URL or other such identifier or address, it is understood that such
identifiers can change and particular information on the internet can come
and go, but equivalent information can be found by searching the
internet. Reference thereto evidences the availability and public
dissemination of such information.

As used herein, an oligonucleotide means a linear sequence of up 
to about 20, about 50, or about 100, nucleotides joined by 
phosphodiester bonds. Above this length the term polynucleotide begins
to be used.

As used herein, an oligonucleotide analog means a linear sequence 
of up to about 20, about 50, or about 100, nucleotide analogs, or linear 
sequence of up to about 20, about 50, or about 100 nucleotides linked by 
a "backbone" bond other than a phosphodiester bond, for example, a 
phosphotriester bond, a phosphoramidate bond, a phosphorothioate bond,
a methylphosphonate diester bond, a thioester bond, or a peptide bond
(peptide nucleic acid).

As used herein, peptide nucleic acid (PNA) refers to nucleic acid
analogs in which the ribose-phosphate backbone is replaced by a backbone
held together by amide bonds.

As used herein, proteome means all the proteins present within a
cell.

As used herein, a biomolecule is any compound found in nature, or 
derivatives thereof. Biomolecules include, but are not limited to,
oligonucleotides, oligonucleosides, proteins, peptides, amino acids, lipids,
steroids, peptide nucleic acids (PNAs), oligosaccharides and 
monosaccharides. 

As used herein, MALDI-TOF refers to matrix assisted laser 
desorption ionization-time of flight mass spectrometry. 

As used herein, the term "conditioned" or "conditioning," when
used in reference to a protein, means that the polypeptide is
modified to decrease the laser energy required to volatilize the protein, to
minimize the likelihood of fragmentation of the protein, or to increase the
resolution of a mass spectrum of the protein or of the component amino
acids. Resolution of a mass spectrum of a protein can be increased by
conditioning the protein prior to performing mass spectrometry.
Conditioning can be performed at any stage prior to mass spectrometry
and, in one embodiment, is performed while the protein is immobilized. A
protein can be conditioned, for example, by treating it with a cation
exchange material or an anion exchange material, which can reduce the
charge heterogeneity of the protein, thereby eliminating peak
broadening due to heterogeneity in the number of cations (or anions)
bound to the various proteins in a population. In one embodiment, removal
of all cations by ion exchange, except for H+ and ammonium ions, is
performed. By contacting a polypeptide with an alkylating agent such as
alkyliodide, iodoacetamide, iodoethanol, or 2,3-epoxy-1-propanol, the
formation of disulfide bonds, for example, in a protein can be prevented.
Likewise, charged amino acid side chains can be converted to uncharged
derivatives employing trialkylsilyl chlorides.
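
The peak broadening attributed above to heterogeneity in the number of bound cations can be made concrete with a little arithmetic. The sketch below is an illustration only and not part of the disclosure. It assumes that each bound sodium ion replaces one proton, uses monoisotopic masses for sodium (22.98977 Da) and hydrogen (1.00783 Da), and takes a hypothetical protein of 10,000 Da; the function names are likewise hypothetical.

```python
from typing import List

NA_MASS = 22.98977  # monoisotopic mass of sodium, in Da
H_MASS = 1.00783    # monoisotopic mass of hydrogen, in Da

# Mass shift when one proton is exchanged for one sodium ion (assumption).
NA_FOR_H_SHIFT = NA_MASS - H_MASS


def adduct_masses(neutral_mass: float, max_sodium: int) -> List[float]:
    """Masses of species carrying 0..max_sodium sodium-for-proton exchanges."""
    return [neutral_mass + n * NA_FOR_H_SHIFT for n in range(max_sodium + 1)]


def mass_spread(masses: List[float]) -> float:
    """Width of the mass range covered by a population of species."""
    return max(masses) - min(masses)


# Hypothetical 10,000 Da protein carrying between zero and three sodium ions.
population = adduct_masses(10000.0, 3)
spread_before = mass_spread(population)

# After ion exchange only the sodium-free species is assumed to remain.
spread_after = mass_spread(adduct_masses(10000.0, 0))
```

Under these assumptions the untreated population spans roughly 66 Da, about 22 Da per exchanged ion, while the conditioned population collapses to a single mass. This is the sense in which removing cations other than H+ and ammonium narrows the observed peak.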

Since the capture compounds contain protein and nucleic acid 
portions, conditioning suitable for one or both portions is also 
contemplated. Hence, a prepurification to enrich the biomolecules to be 
analyzed and the removal of all cations, such as by ion exchange, except
for H+ and ammonium, or other conditioning treatment to improve
resolution is advantageous for analysis of the nucleic acid portion as well
as the protein portion.

Conditioning of proteins is generally unnecessary because proteins 
are relatively stable under acidic, high energy conditions so that proteins 
do not require conditioning for mass spectrometric analyses. There are
means of improving resolution, however, in one embodiment for shorter
peptides, such as by incorporating modified amino acids that are more
basic than the corresponding unmodified residues. Such modification in
general increases the stability of the polypeptide during mass
spectrometric analysis. Also, cation exchange chromatography, as well
as general washing and purification procedures that remove other proteins
and other reaction mixture components from the protein, can be used to
increase the resolution of the spectrum resulting from mass spectrometric
analysis of the protein.

As used herein, "matrix" refers to the material with which the 
capture compound-biomolecule conjugates are combined for MALDI mass
spectrometric analysis. Any matrix material, such as solid acids,
including 3-hydroxypicolinic acid, and liquid matrices, such as glycerol,
known to those of skill in the art for nucleic acid and/or protein analyses is
contemplated. Since the compound-biomolecule conjugates contain
nucleic acid and protein, a mixture (optimal for nucleic acids and proteins)
of matrix molecules can be used.

As used herein, macromolecule refers to any molecule having a
molecular weight from the hundreds up to the millions. Macromolecules
include, but are not limited to, peptides, proteins, nucleotides, nucleic
acids, carbohydrates, and other such molecules that are generally
synthesized by biological organisms, but can be prepared synthetically or
using recombinant molecular biology methods.

As used herein, the term "biopolymer" refers to a biological
molecule, including macromolecules, composed of two or more
monomeric subunits, or derivatives thereof, which are linked by a bond or
a macromolecule. A biopolymer can be, for example, a polynucleotide, a
polypeptide, a carbohydrate, or a lipid, or derivatives or combinations
thereof, for example, a nucleic acid molecule containing a peptide nucleic
acid portion or a glycoprotein. The methods and collections herein,
though described with reference to biopolymers, can be adapted for use
with other synthetic schemes and assays, such as organic syntheses of
pharmaceuticals, or inorganics and any other reaction or assay performed
on a solid support or in a well in nanoliter or smaller volumes.

As used herein, biomolecule includes biopolymers and 
macromolecules and all molecules that can be isolated from living 
organisms and viruses, including, but not limited to, cells, tissues,
prions, animals, plants, viruses, bacteria and other organisms.

As used herein, a biological particle refers to a virus, such as a viral 
vector or viral capsid with or without packaged nucleic acid, phage,
including a phage vector or phage capsid, with or without encapsulated
nucleic acid, a single cell, including eukaryotic and prokaryotic cells or
fragments thereof, a liposome or micellar agent or other packaging
particle, and other such biological materials. For purposes herein, 
biological particles include molecules that are not typically considered 
macromolecules because they are not generally synthesized, but are 
derived from cells and viruses. 

As used herein, a drug refers to any compound that is a candidate
for use as a therapeutic or as a lead compound for designing a therapeutic
or that is a known pharmaceutical. Such compounds can be small
molecules, including small organic molecules, peptides, peptide mimetics,
antisense molecules, antibodies, fragments of antibodies and recombinant
antibodies. Of particular interest are "drugs" that have specific binding
properties so that they can be used as selectivity groups or can be used
for sorting of the capture compounds, either as a sorting functionality that
binds to a target on a support, or linked to a solid support, where the
sorting functionality is the drug target.

As used herein, the term "nucleic acid" refers to single-stranded
and/or double-stranded polynucleotides such as deoxyribonucleic acid
(DNA), and ribonucleic acid (RNA) as well as analogs or derivatives of
either RNA or DNA. Nucleic acid molecules are linear polymers of
nucleotides, linked by 3', 5' phosphodiester linkages. In DNA,
deoxyribonucleic acid, the sugar group is deoxyribose and the bases of
the nucleotides are adenine, guanine, thymine and cytosine. RNA,
ribonucleic acid, has ribose as the sugar and uracil replaces thymine.
Also included in the term "nucleic acid" are analogs of nucleic acids such
as peptide nucleic acid (PNA), phosphorothioate DNA, and other such
analogs and derivatives or combinations thereof.

As used herein, the term "polynucleotide" refers to an oligomer or 
polymer containing at least two linked nucleotides or nucleotide 
derivatives, including a deoxyribonucleic acid (DNA), a ribonucleic acid
(RNA), and a DNA or RNA derivative containing, for example, a nucleotide
analog or a "backbone" bond other than a phosphodiester bond, for
example, a phosphotriester bond, a phosphoramidate bond, a
methylphosphonate diester bond, a phosphorothioate bond, a thioester
bond, or a peptide bond (peptide nucleic acid). The term
"oligonucleotide" also is used herein essentially synonymously with
"polynucleotide," although those in the art recognize that
oligonucleotides, for example, PCR primers, generally are less than about
fifty to one hundred nucleotides in length.

Nucleotide analogs contained in a polynucleotide can be, for
example, mass modified nucleotides, which allow for mass
differentiation of polynucleotides; nucleotides containing a detectable
label such as a fluorescent, radioactive, colorimetric, luminescent or
chemiluminescent label, which allows for detection of a polynucleotide; or
nucleotides containing a reactive group such as biotin or a thiol group,
which facilitates immobilization of a polynucleotide to a solid support. A
polynucleotide also can contain one or more backbone bonds that are
selectively cleavable, for example, chemically, enzymatically or
photolytically. For example, a polynucleotide can include one or more
deoxyribonucleotides, followed by one or more ribonucleotides, which can
be followed by one or more deoxyribonucleotides, such a sequence being
cleavable at the ribonucleotide sequence by base hydrolysis. A
polynucleotide also can contain one or more bonds that are relatively
resistant to cleavage, for example, a chimeric oligonucleotide primer,
which can include nucleotides linked by peptide nucleic acid bonds and at
least one nucleotide at the 3' end, which is linked by a phosphodiester
bond, or the like, and is capable of being extended by a polymerase.
Peptide nucleic acid sequences can be prepared using well known
methods (see, for example, Weiler et al. (1997) Nucleic Acids Res.
25:2792-2799).

A polynucleotide can be a portion of a larger nucleic acid molecule, 
for example, a portion of a gene, which can contain a polymorphic region, 
or a portion of an extragenic region of a chromosome, for example, a 
portion of a region of nucleotide repeats such as a short tandem repeat
(STR) locus, a variable number of tandem repeats (VNTR) locus, a
microsatellite locus or a minisatellite locus. A polynucleotide also can be
single stranded or double stranded, including, for example, a DNA-RNA
hybrid, or can be triple stranded or four stranded. Where the
polynucleotide is double stranded DNA, it can be in an A, B, L or Z
configuration, and a single polynucleotide can contain combinations of
such configurations. 

As used herein, a "mass modification" with respect to a
biomolecule to be analyzed by mass spectrometry, refers to the inclusion
of changes in constituent atoms or groups that change the molecular
weight of the resulting molecule in defined increments detectable by mass
spectrometric analysis. Mass modifications do not include radiolabels, such
as isotope labels, or fluorescent groups or other such tags normally used
for detection by means other than mass spectrometry.
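
The notion of "defined increments detectable by mass spectrometric analysis" can be illustrated with a small check. The sketch below is hypothetical and not part of the disclosure: the function names, the 5,000 Da base mass, the 14 Da increment and the separation thresholds are all assumed values. It generates a series of mass-modified species and tests whether neighboring species differ by at least a chosen minimum separation.

```python
from typing import List


def modified_masses(base_mass: float, increment: float, count: int) -> List[float]:
    """Masses of `count` species that differ by a fixed mass increment."""
    return [base_mass + i * increment for i in range(count)]


def all_resolvable(masses: List[float], min_separation: float) -> bool:
    """True if every pair of neighboring masses differs by >= min_separation."""
    ordered = sorted(masses)
    return all(b - a >= min_separation for a, b in zip(ordered, ordered[1:]))


# Four hypothetical species spaced 14 Da apart, starting at 5,000 Da.
series = modified_masses(5000.0, 14.0, 4)
```

With these assumed numbers the series passes a 5 Da separation requirement and fails a 20 Da requirement. The separation actually needed depends on the instrument and the mass range, which this sketch does not model.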

As used herein, the term "polypeptide," means at least two amino
acids, or amino acid derivatives, including mass modified amino acids and
amino acid analogs, which are linked by a peptide bond and which can be
a modified peptide bond. A polypeptide can be translated from a
polynucleotide, which can include at least a portion of a coding sequence, or
a portion of a nucleotide sequence that is not naturally translated due, for
example, to it being located in a reading frame other than a coding frame,
or it being an intron sequence, a 3' or 5' untranslated sequence, or a
regulatory sequence such as a promoter. A polypeptide also can be
chemically synthesized and can be modified by chemical or enzymatic
methods following translation or chemical synthesis. The terms
"polypeptide," "peptide" and "protein" are used essentially synonymously
herein, although the skilled artisan recognizes that peptides generally
contain fewer than about fifty to one hundred amino acid residues, and
that proteins often are obtained from a natural source and can contain,
for example, post-translational modifications. A polypeptide can be
post-translationally modified by, for example, phosphorylation
(phosphoproteins), glycosylation (glycoproteins, proteoglycans), which
can be performed in a cell or in a reaction in vitro.

As used herein, the term "conjugated" refers to stable attachment,
typically by virtue of a chemical interaction, including ionic and/or
covalent attachment. Among the conjugation means are streptavidin- or
avidin-to-biotin interaction; hydrophobic interaction; magnetic interaction
(e.g., using functionalized magnetic beads, such as DYNABEADS, which
are streptavidin-coated magnetic beads sold by Dynal, Inc., Great Neck,
NY and Oslo, Norway); polar interactions, such as "wetting" associations
between two polar surfaces or between oligo/polyethylene glycol;
formation of a covalent bond, such as an amide bond, disulfide bond,
thioether bond, or via crosslinking agents; and via an acid-labile or
photocleavable linker.

As used herein, "sample" refers to a composition containing a
material to be detected. For purposes herein, sample refers to anything
that can contain a biomolecule. The sample can be a biological sample,
such as a biological fluid or a biological tissue obtained from any organism
or a cell of or from an organism or a viral particle or portions thereof.
Examples of biological fluids include urine, blood, plasma, serum, saliva,
semen, stool, sputum, cerebral spinal fluid, tears, mucus, sperm, amniotic
fluid or the like. Biological tissues are aggregates of cells, usually of a
particular kind, together with their intercellular substance, that form one
of the structural materials of a human, animal, plant, bacterial, fungal or
viral structure, including connective, epithelium, muscle and nerve tissues.
Examples of biological tissues also include organs, tumors, lymph nodes,
arteries and individual cell(s).

Thus, samples include biological samples (e.g., any material
obtained from a source originating from a living being (e.g., human,
animal, plant, bacteria, fungi, protist, virus)). The biological sample can
be in any form, including solid materials (e.g., tissue, cell pellets and
biopsies, tissues from cadavers) and biological fluids (e.g., urine, blood,
saliva, amniotic fluid and mouth wash (containing buccal cells)). In
certain embodiments, solid materials are mixed with a fluid. In
embodiments herein, a sample for mass spectrometric analysis
includes samples that contain a mixture of matrix used for mass
spectrometric analyses and the capture compound/biomolecule
complexes.

As used herein, the term "solid support" means a non-gaseous,
non-liquid material having a surface. Thus, a solid support can be a flat
surface constructed, for example, of glass, silicon, metal, plastic or a
composite; or can be in the form of a bead such as a silica gel, a
controlled pore glass, a magnetic or cellulose bead; or can be a pin,
including an array of pins suitable for combinatorial synthesis or analysis.

As used herein, a collection refers to a combination of two or more
members, generally 3, 5, 10, 50, 100, 500, 1000 or more members. In
particular, a collection refers to such a combination of the capture
compounds as provided herein.

As used herein, an array refers to a collection of elements, such as
the capture compounds, containing three or more members. An
addressable array is one in which the members of the array are
identifiable, typically by position on a solid phase support but also by
virtue of an identifier or detectable label. Hence, in general the members
of an array are immobilized to discrete identifiable loci on the surface of
a solid phase. A plurality of the compounds are attached to a support, such
as an array (i.e., a pattern of two or more) on the surface of a support,
such as a silicon chip or other surface, generally through binding of the
sorting functionality with a group or compound on the surface of the
support. Addressing can be achieved by labeling each member
electronically, such as with a radio-frequency (RF) tag, through the use
of color coded beads or other such identifiable and color coded labels and
through molecular weight. Hence, in general the members of the array
are immobilized to discrete identifiable loci on the surface of a solid phase
or directly or indirectly linked to or otherwise associated with the
identifiable label, such as affixed to a microsphere or other particulate
support (herein referred to as beads) and suspended in solution or spread
out on a surface.

As used herein, a gradient array refers to an array of moieties, such
as X and Y, such that properties of members of the array gradually
change along each axis. X is a reactivity group or function as defined
herein and Y is a selectivity function or group as defined herein. Selected
properties of each X and Y are modified in a predetermined manner.

As used herein, "substrate" refers to an insoluble support onto
which a sample and/or matrix is deposited. The support can be fabricated
from virtually any insoluble or solid material, for example, silica gel, glass
(e.g., controlled-pore glass (CPG)), nylon, Wang resin, Merrifield resin,
dextran cross-linked with epichlorohydrin (e.g., Sephadex®), agarose
(e.g., Sepharose®), cellulose, magnetic beads, Dynabeads, a metal surface
(e.g., steel, gold, silver, aluminum, silicon and copper), or a plastic
material (e.g., polyethylene, polypropylene, polyamide, polyester,
polyvinylidenedifluoride (PVDF)). Exemplary substrates include, but are
not limited to, beads (e.g., silica gel, controlled pore glass, magnetic,
dextran cross-linked with epichlorohydrin (e.g., Sephadex®), agarose (e.g.,
Sepharose®), cellulose), capillaries, flat supports such as glass fiber
filters, glass surfaces, metal surfaces (steel, gold, silver, aluminum,
copper and silicon), plastic materials including multiwell plates or
membranes (e.g., of polyethylene, polypropylene, polyamide,
polyvinylidenedifluoride), pins (e.g., arrays of pins suitable for
combinatorial synthesis or analysis) or beads in pits of flat surfaces such
as wafers (e.g., silicon wafers) with or without filter plates. The solid
support is in any desired form, including, but not limited to, a bead,
capillary, plate, membrane, wafer, comb, pin, a wafer with pits, an array
of pits or nanoliter wells and other geometries and forms known to those
of skill in the art. Supports include flat surfaces designed to receive or
link samples at discrete loci. In one embodiment, flat surfaces include
those with hydrophobic regions surrounding hydrophilic loci for
receiving, containing or binding a sample.

The supports can be particulate or can be in the form of a
continuous surface, such as a microtiter dish or well, a glass slide, a
silicon chip, a nitrocellulose sheet, nylon mesh, or other such materials.
When particulate, typically the particles have at least one dimension in
the 5-10 mm range or smaller. Such particles, referred to collectively
herein as "beads", are often, but not necessarily, spherical. Reference to
"bead," however, does not constrain the geometry of the matrix, which
can be any shape, including random shapes, needles, fibers, and
elongated shapes. "Beads", particularly microspheres that are sufficiently
small to be used in the liquid phase, are also contemplated. The "beads"
can include additional components, such as magnetic or paramagnetic
particles (see, e.g., Dynabeads (Dynal, Oslo, Norway)) for separation
using magnets, as long as the additional components do not interfere with
the methods and analyses herein.

As used herein, "polymorphism" refers to the coexistence of more
than one form of a gene or portion thereof. A portion of a gene of which
there are at least two different forms, e.g., two different nucleotide
sequences, is referred to as a "polymorphic region of a gene". A
polymorphic region can be a single nucleotide, e.g., a single nucleotide
polymorphism (SNP), the identity of which differs in different alleles. A
polymorphic region also can be several nucleotides in length.

As used herein, "polymorphic gene" refers to a gene having at least 
one polymorphic region. 

As used herein, "allele", which is used interchangeably herein with
"allelic variant", refers to alternative forms of a gene or portions thereof.
Alleles occupy the same locus or position on homologous chromosomes.
When a subject has two identical alleles of a gene, the subject is said to
be homozygous for the gene or allele. When a subject has two different
alleles of a gene, the subject is said to be heterozygous for the gene.
Alleles of a specific gene can differ from each other in a single nucleotide,
or several nucleotides, and can include substitutions, deletions, and
insertions of nucleotides. An allele of a gene also can be a form of a gene
containing a mutation.

As used herein, "predominant allele" refers to an allele that is
represented in the greatest frequency for a given population. The allele or
alleles that are present in lesser frequency are referred to as allelic
variants.
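
By way of illustration only, the distinction between the predominant
allele and the allelic variants can be sketched as a frequency count over
alleles observed in a population. The function name, the allele labels and
the counts below are hypothetical and are not taken from the disclosure;
ties are resolved by order of first appearance, which is an assumption of
this sketch.

```python
from collections import Counter


def predominant_allele(observed_alleles):
    """Split observed alleles into the predominant allele and the variants.

    The predominant allele is the one seen most often; every other allele
    that was observed is reported as an allelic variant.
    """
    counts = Counter(observed_alleles)
    if not counts:
        raise ValueError("no alleles were observed")
    # most_common(1) returns the allele with the highest count.
    predominant, _ = counts.most_common(1)[0]
    variants = sorted(allele for allele in counts if allele != predominant)
    return predominant, variants


# Hypothetical population of 100 chromosomes typed at one SNP position.
population = ["A"] * 70 + ["G"] * 25 + ["T"] * 5
print(predominant_allele(population))
```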

As used herein, "associated" refers to coincidence with the
development or manifestation of a disease, condition or phenotype.
Association can be due to, but is not limited to, genes responsible for
housekeeping functions whose alteration can provide the foundation for a
variety of diseases and conditions, those that are part of a pathway that
is involved in a specific disease, condition or phenotype and those that
indirectly contribute to the manifestation of a disease, condition or
phenotype.

As used herein, the term "subject" refers to a living organism, such
as a mammal, a plant, a fungus, an invertebrate, a fish, an insect, a
pathogenic organism, such as a virus or a bacterium, and includes
humans and other mammals.

As used herein, the term "gene" or "recombinant gene" refers to a
nucleic acid molecule containing an open reading frame and including at
least one exon and (optionally) an intron sequence. A gene can be either
RNA or DNA. Genes can include regions preceding and following the
coding region.

As used herein, "intron" refers to a DNA fragment present in a
given gene that is spliced out during mRNA maturation.

As used herein, "nucleotide sequence complementary to the
nucleotide sequence set forth in SEQ ID NO: x" refers to the nucleotide
sequence of the complementary strand of a nucleic acid strand having
SEQ ID NO: x. The term "complementary strand" is used herein
interchangeably with the term "complement". The complement of a
nucleic acid strand can be the complement of a coding strand or the
complement of a non-coding strand. When referring to double stranded
nucleic acids, the complement of a nucleic acid having SEQ ID NO: x
refers to the complementary strand of the strand having SEQ ID NO: x or
to any nucleic acid having the nucleotide sequence of the complementary
strand of SEQ ID NO: x. When referring to a single stranded nucleic acid
having the nucleotide sequence SEQ ID NO: x, the complement of this
nucleic acid is a nucleic acid having a nucleotide sequence that is
complementary to that of SEQ ID NO: x.

As used herein, the term "coding sequence" refers to that portion
of a gene that encodes the amino acids that constitute a polypeptide or
protein.

As used herein, the term "sense strand" refers to that strand of a 
double-stranded nucleic acid molecule that has the sequence of the 
mRNA that encodes the amino acid sequence encoded by the double- 
stranded nucleic acid molecule. 

As used herein, the term "antisense strand" refers to that strand of
a double-stranded nucleic acid molecule that is the complement of the
sequence of the mRNA that encodes the amino acid sequence encoded
by the double-stranded nucleic acid molecule.
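
By way of illustration only, the relationship among the sense strand, the
antisense strand and the mRNA can be sketched for a DNA duplex as
follows. The function names and the six-nucleotide example are
hypothetical; the sketch assumes an unambiguous DNA alphabet (A, C, G,
T) and writes every strand 5' to 3', so the antisense strand is returned
as the reverse complement of the sense strand.

```python
# Watson-Crick pairing for an unambiguous DNA alphabet.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_strand(sense_dna):
    """Return the antisense strand (5'->3') for a sense strand (5'->3')."""
    # Complement each base, then reverse so the result also reads 5'->3'.
    return sense_dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def mrna_from_sense(sense_dna):
    """Return the mRNA sequence, which matches the sense strand with U for T."""
    return sense_dna.upper().replace("T", "U")


sense = "ATGGCC"  # hypothetical sense-strand fragment
print(antisense_strand(sense))  # GGCCAT
print(mrna_from_sense(sense))   # AUGGCC
```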

As used herein, the amino acids, which occur in the various amino
acid sequences appearing herein, are identified according to their
well-known, three-letter or one-letter abbreviations. The nucleotides, which
occur in the various DNA fragments, are designated with the standard
single-letter designations used routinely in the art (see, Table 1).

As used herein, amino acid residue refers to an amino acid formed
upon chemical digestion (hydrolysis) of a polypeptide at its peptide
linkages. The amino acid residues described herein are, in certain
embodiments, in the "L" isomeric form. Residues in the "D" isomeric
form can be substituted for any L-amino acid residue, as long as the
desired functional property is retained by the polypeptide. NH₂ refers to
the free amino group present at the amino terminus of a polypeptide.
COOH refers to the free carboxy group present at the carboxyl terminus
of a polypeptide. In keeping with standard polypeptide nomenclature
described in J. Biol. Chem., 243:3552-59 (1969) and adopted at 37
C.F.R. §§ 1.821-1.822, abbreviations for amino acid residues are
shown in the following Table:

Table 1

Table of Correspondence

SYMBOL
1-Letter    3-Letter    AMINO ACID
Y           Tyr         tyrosine
G           Gly         glycine
F           Phe         phenylalanine
M           Met         methionine
A           Ala         alanine
S           Ser         serine
I           Ile         isoleucine
L           Leu         leucine
T           Thr         threonine
V           Val         valine
P           Pro         proline
K           Lys         lysine
H           His         histidine
Q           Gln         glutamine
E           Glu         glutamic acid
Z           Glx         Glu and/or Gln
W           Trp         tryptophan
R           Arg         arginine
D           Asp         aspartic acid
N           Asn         asparagine
B           Asx         Asn and/or Asp
C           Cys         cysteine
X           Xaa         Unknown or other


It should be noted that all amino acid residue sequences
represented herein by formulae have a left to right orientation in the
conventional direction of amino-terminus to carboxyl-terminus. In
addition, the phrase "amino acid residue" is broadly defined to include the
amino acids listed in the Table of Correspondence and modified and
unusual amino acids, such as those referred to in 37 C.F.R. §§ 1.821-
1.822, and incorporated herein by reference. Furthermore, it should be
noted that a dash at the beginning or end of an amino acid residue
sequence indicates a peptide bond to a further sequence of one or more
amino acid residues or to an amino-terminal group such as NH₂ or to a
carboxyl-terminal group such as COOH.
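
By way of illustration only, the Table of Correspondence can be held as a
lookup from 1-letter to 3-letter symbols and applied to a sequence written
amino-terminus to carboxyl-terminus. The names ONE_TO_THREE and
to_three_letter are hypothetical, and joining residues with a hyphen is a
formatting choice of this sketch rather than a requirement stated herein.

```python
# 1-letter to 3-letter symbols, as listed in the Table of Correspondence.
ONE_TO_THREE = {
    "Y": "Tyr", "G": "Gly", "F": "Phe", "M": "Met", "A": "Ala",
    "S": "Ser", "I": "Ile", "L": "Leu", "T": "Thr", "V": "Val",
    "P": "Pro", "K": "Lys", "H": "His", "Q": "Gln", "E": "Glu",
    "Z": "Glx", "W": "Trp", "R": "Arg", "D": "Asp", "N": "Asn",
    "B": "Asx", "C": "Cys", "X": "Xaa",
}


def to_three_letter(sequence):
    """Rewrite a 1-letter sequence (N- to C-terminus) in 3-letter symbols.

    Raises KeyError for any symbol that is not in the table.
    """
    return "-".join(ONE_TO_THREE[symbol] for symbol in sequence.upper())


print(to_three_letter("MKT"))  # Met-Lys-Thr
```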

In a peptide or protein, suitable conservative substitutions of amino
acids are known to those of skill in this art and can be made generally
without altering the biological activity of the resulting molecule. Those of
skill in this art recognize that, in general, single amino acid substitutions
in non-essential regions of a polypeptide do not substantially alter
biological activity (see, e.g., Watson et al. Molecular Biology of the Gene,
4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224).

Such substitutions can be made in accordance with those set forth
in TABLE 2 as follows:

TABLE 2

Original residue    Conservative substitution
Ala (A)             Gly; Ser
Arg (R)             Lys
Asn (N)             Gln; His
Asp (D)             Glu
Cys (C)             Ser
Gln (Q)             Asn
Glu (E)             Asp
Gly (G)             Ala; Pro
His (H)             Asn; Gln
Ile (I)             Leu; Val
Leu (L)             Ile; Val
Lys (K)             Arg; Gln
Met (M)             Leu; Tyr; Ile
Phe (F)             Met; Leu; Tyr
Ser (S)             Thr
Thr (T)             Ser
Trp (W)             Tyr
Tyr (Y)             Trp; Phe
Val (V)             Ile; Leu

Other substitutions are also permissible and can be determined empirically 
or in accord with known conservative substitutions. 
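
By way of illustration only, TABLE 2 can be encoded as a lookup keyed by
the original residue. The names below are hypothetical. The sketch
treats the table as directional, exactly as listed (for example, Lys can
be replaced by Gln, but Gln is listed only with Asn), and it reports
only whether a substitution appears in TABLE 2, not whether it is
permissible on other grounds.

```python
# Original residue -> substitutions listed for it in TABLE 2.
TABLE_2 = {
    "Ala": {"Gly", "Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"},
    "Asp": {"Glu"}, "Cys": {"Ser"}, "Gln": {"Asn"}, "Glu": {"Asp"},
    "Gly": {"Ala", "Pro"}, "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"}, "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Tyr", "Ile"}, "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"}, "Thr": {"Ser"}, "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"}, "Val": {"Ile", "Leu"},
}


def listed_in_table_2(original, replacement):
    """Return True if TABLE 2 lists `replacement` for `original`."""
    return replacement in TABLE_2.get(original, set())


print(listed_in_table_2("Lys", "Arg"))  # True
print(listed_in_table_2("Gln", "Lys"))  # False: not listed in this direction
```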

As used herein, a DNA or nucleic acid homolog refers to a nucleic
acid that includes a preselected conserved nucleotide sequence, such as a
sequence encoding a therapeutic polypeptide. By the term "substantially
homologous" is meant having at least 80%, at least 90%, at least 95%
homology therewith or a lesser percentage of homology or identity and
conserved biological activity or function.

The terms "homology" and "identity" are often used
interchangeably. In this regard, percent homology or identity can be
determined, for example, by comparing sequence information using a GAP
computer program. The GAP program uses the alignment method of
Needleman and Wunsch (J. Mol. Biol. 48:443 (1970)), as revised by Smith
and Waterman (Adv. Appl. Math. 2:482 (1981)). Briefly, the GAP program
defines similarity as the number of aligned symbols (e.g., nucleotides or
amino acids) that are similar, divided by the total number of symbols in
the shorter of the two sequences. The default parameters for the GAP
program can include: (1) a unary comparison matrix (containing a value of
1 for identities and 0 for non-identities) and the weighted comparison
matrix of Gribskov and Burgess, Nucl. Acids Res. 14:6745 (1986), as
described by Schwartz and Dayhoff, eds., ATLAS OF PROTEIN
SEQUENCE AND STRUCTURE, National Biomedical Research Foundation,
pp. 353-358 (1979); (2) a penalty of 3.0 for each gap and an additional
0.10 penalty for each symbol in each gap; and (3) no penalty for
end gaps.
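
By way of illustration only, the similarity ratio described above can be
sketched for two sequences that have already been aligned. This is not
the GAP program: the function name is hypothetical, the alignment
itself (and its gap penalties) is assumed to have been produced
elsewhere, gaps are assumed to be written as "-", and "similar" is
taken to mean identical, as with the unary comparison matrix.

```python
def similarity_from_alignment(aligned_a, aligned_b, gap="-"):
    """Similar aligned symbols divided by the length of the shorter sequence.

    Both arguments are rows of one pairwise alignment, so they must have
    the same length; gap characters do not count as sequence symbols.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Unary comparison: score 1 for an identity, 0 otherwise.
    similar = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a != gap and a == b
    )
    shorter = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shorter == 0:
        raise ValueError("each sequence must contain at least one symbol")
    return similar / shorter


print(similarity_from_alignment("ACGT", "ACCT"))      # 0.75
print(similarity_from_alignment("ACGTAC", "AC-TAC"))  # 1.0
```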

Whether any two nucleic acid molecules have nucleotide sequences
that are at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99%
"identical" can be determined using known computer algorithms such as
the "FASTA" program, using for example, the default parameters as in
Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444 (1988).
Alternatively, the BLAST function of the National Center for Biotechnology
Information database can be used to determine identity.

In general, sequences are aligned so that the highest order match
is obtained. "Identity" per se has an art-recognized meaning and can be
calculated using published techniques. (See, e.g.: Computational
Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York,
1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York, 1993; Computer Analysis of Sequence Data,
Part I, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey,
1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic
Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux,
J., eds., M Stockton Press, New York, 1991). While there exist a number
of methods to measure identity between two polynucleotide or
polypeptide sequences, the term "identity" is well known to skilled
artisans (Carillo, H. & Lipton, D., SIAM J Applied Math 48:1073 (1988)).
Methods commonly employed to determine identity or similarity between
two sequences include, but are not limited to, those disclosed in Guide to
Huge Computers, Martin J. Bishop, ed., Academic Press, San Diego,
1994, and Carillo, H. & Lipton, D., SIAM J Applied Math 48:1073
(1988). Methods to determine identity and similarity are codified in
computer programs. Computer program methods to determine identity
and similarity between two sequences include, but are not limited to,
GCG program package (Devereux, J., et al., Nucleic Acids Research
12(1):387 (1984)), BLASTP, BLASTN, FASTA (Altschul, S.F., et al., J
Molec Biol 215:403 (1990)).

Therefore, as used herein, the term "identity" represents a
comparison between a test and a reference polypeptide or polynucleotide.
For example, a test polypeptide can be defined as any polypeptide that is
90% or more identical to a reference polypeptide.
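
By way of illustration only, the comparison of a test polypeptide against
a reference polypeptide can be sketched as a position-by-position count.
The function names and the example sequences are hypothetical, and the
sketch assumes that the two sequences have the same length and are
already aligned without gaps, so that only substitutions are counted.

```python
def percent_identity(test, reference):
    """Percent of positions at which test and reference are identical."""
    if not reference or len(test) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    identical = sum(1 for t, r in zip(test, reference) if t == r)
    return 100.0 * identical / len(reference)


def is_at_least(test, reference, threshold=90.0):
    """Return True if test meets the percent identity threshold."""
    return percent_identity(test, reference) >= threshold


# Hypothetical 100-residue reference and two test polypeptides.
reference = "A" * 100
test_10_differences = "A" * 90 + "G" * 10
test_11_differences = "A" * 89 + "G" * 11

print(percent_identity(test_10_differences, reference))  # 90.0
print(is_at_least(test_10_differences, reference))       # True
print(is_at_least(test_11_differences, reference))       # False
```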

As used herein, the term "at least 90% identical to" refers to
percent identities from 90 to 99.99 relative to the reference polypeptides.
Identity at a level of 90% or more is indicative of the fact that, assuming
for exemplification purposes that a test and a reference polypeptide each
100 amino acids in length are compared, no more than 10% (i.e., 10 out of
100) of the amino acids in the test polypeptide differ from those of the
reference polypeptide. Similar comparisons can be made between test and
reference polynucleotides. Such differences can be represented as point
mutations randomly distributed over the entire length of an amino acid
sequence or they can be clustered in one or more locations of varying
length up to the maximum allowable, e.g., 10/100 amino acid difference
(approximately 90% identity). Differences are defined as nucleic acid or
amino acid substitutions, or deletions.

As used herein, stringency of hybridization in determining
percentage mismatch is as follows:

1) high stringency: 0.1 x SSPE, 0.1% SDS, 65°C

2) medium stringency: 0.2 x SSPE, 0.1% SDS, 50°C

3) low stringency: 1.0 x SSPE, 0.1% SDS, 50°C

Those of skill in this art know that the washing step selects for
stable hybrids and also know the ingredients of SSPE (see, e.g.,
J. Sambrook, E.F. Fritsch, T. Maniatis, in: Molecular Cloning, A Laboratory
Manual, Cold Spring Harbor Laboratory Press (1989), vol. 3, p. B.13; see
also numerous catalogs that describe commonly used laboratory
solutions). SSPE is pH 7.4 phosphate-buffered 0.18 M NaCl. Further,
those of skill in the art recognize that the stability of hybrids is determined
by $T_m$, which is a function of the sodium ion concentration and
temperature ($T_m = 81.5\,^{\circ}\mathrm{C} - 16.6(\log_{10}[\mathrm{Na}^{+}]) + 0.41(\%\mathrm{G}+\mathrm{C}) - 600/l$), so
that the only parameters in the wash conditions critical to hybrid stability
are sodium ion concentration in the SSPE (or SSC) and temperature.

It is understood that equivalent stringencies can be achieved using
alternative buffers, salts and temperatures. By way of example and not
limitation, procedures using conditions of low stringency are as follows
(see also Shilo and Weinberg, Proc. Natl. Acad. Sci. USA 78:6789-6792
(1981)): Filters containing DNA are pretreated for 6 hours at 40°C in a
solution containing 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5),
5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 µg/ml denatured
salmon sperm DNA (10X SSC is 1.5 M sodium chloride, and 0.15 M
sodium citrate, adjusted to a pH of 7).
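
By way of illustration only, the salt concentrations of the SSC dilutions
named in these procedures follow by simple proportion from the stated
composition of 10X SSC. The names below are hypothetical, and the
sketch assumes that an nX solution is a straight dilution of the 10X
stock.

```python
# Composition of 10X SSC as stated above, in mol/L.
TEN_X_SSC = {"sodium chloride": 1.5, "sodium citrate": 0.15}


def ssc_molarity(fold):
    """Return solute molarities (mol/L) for an `fold`X SSC solution."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    # Scale each solute linearly from the 10X stock.
    return {
        solute: molarity * fold / 10.0
        for solute, molarity in TEN_X_SSC.items()
    }


for fold in (5, 2, 0.1):
    concentrations = ssc_molarity(fold)
    print(f"{fold}X SSC:", {k: round(v, 4) for k, v in concentrations.items()})
```

For example, 5X SSC works out to about 0.75 M sodium chloride and
0.075 M sodium citrate, and 0.1X SSC to about 0.015 M and 0.0015 M.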

Hybridizations are carried out in the same solution with the
following modifications: 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100
µg/ml salmon sperm DNA, 10% (wt/vol) dextran sulfate, and 5-20 × 10⁶
cpm ³²P-labeled probe is used. Filters are incubated in hybridization
mixture for 18-20 hours at 40°C, and then washed for 1.5 hours at 55°C
in a solution containing 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA,
and 0.1% SDS. The wash solution is replaced with fresh solution and
incubated an additional 1.5 hours at 60°C. Filters are blotted dry and
exposed for autoradiography. If necessary, filters are washed for a third
time at 65-68°C and reexposed to film. Other conditions of low
stringency which can be used are well known in the art (e.g., as
employed for cross-species hybridizations).

By way of example and not by way of limitation, procedures using
conditions of moderate stringency are as follows: Filters containing DNA
are pretreated for 6 hours at 55°C in a
solution containing 6X SSC, 5X Denhardt's solution, 0.5% SDS and 100
µg/ml denatured salmon sperm DNA. Hybridizations are carried out in the
same solution and 5-20 × 10⁶ cpm ³²P-labeled probe is used. Filters are
incubated in hybridization mixture for 18-20 hours at 55°C, and then
washed twice for 30 minutes at 60°C in a solution containing 1X SSC
and 0.1% SDS. Filters are blotted dry and exposed for autoradiography.
Other conditions of moderate stringency which can be used are
well-known in the art. Washing of filters is done at 37°C for 1 hour in a
solution containing 2X SSC, 0.1% SDS.

By way of example and not by way of limitation, procedures using
conditions of high stringency are as follows: Prehybridization of filters
containing DNA is carried out for 8 hours to overnight at 65°C in buffer
composed of 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02%
PVP, 0.02% Ficoll, 0.02% BSA, and 500 µg/ml denatured salmon sperm
DNA. Filters are hybridized for 48 hours at 65°C in prehybridization
mixture containing 100 µg/ml denatured salmon sperm DNA and 5-20 ×
10⁶ cpm of ³²P-labeled probe. Washing of filters is done at 37°C for
1 hour in a solution containing 2X SSC, 0.01% PVP, 0.01% Ficoll, and
0.01% BSA. This is followed by a wash in 0.1X SSC at 50°C for 45
minutes before autoradiography. Other conditions of high stringency
which can be used are well known in the art.

The term substantially identical or substantially homologous or
similar varies with the context as understood by those skilled in the
relevant art and generally means at least 60% or 70%, preferably means
at least 80%, 85% or more preferably at least 90%, and most preferably
at least 95% identity.

It is to be understood that the compounds provided herein can
contain chiral centers. Such chiral centers can be of either the (R) or (S)
configuration, or can be a mixture thereof. Thus, the compounds
provided herein can be enantiomerically pure, or be stereoisomeric or
diastereomeric mixtures. In the case of amino acid residues, such
residues can be of either the L- or D-form. In one embodiment, the
configuration for naturally occurring amino acid residues is L.

As used herein, substantially pure means sufficiently homogeneous
to appear free of readily detectable impurities as determined by standard
methods of analysis, such as thin layer chromatography (TLC), gel
electrophoresis, high performance liquid chromatography (HPLC) and
mass spectrometry (MS), used by those of skill in the art to assess such
purity, or sufficiently pure such that further purification would not
detectably alter the physical and chemical properties, such as enzymatic
and biological activities, of the substance. Methods for purification of the
compounds to produce substantially chemically pure compounds are
known to those of skill in the art. A substantially chemically pure
compound can, however, be a mixture of stereoisomers. In such
instances, further purification might increase the specific activity of the
compound.

As used herein, a cleavable bond or moiety refers to a bond or
moiety that is cleaved or cleavable under specific conditions, such as
chemically, enzymatically or photolytically. Where not specified herein,
such bond is cleavable under conditions of MALDI-MS analysis, such as
by a UV or IR laser.

As used herein, a "selectively cleavable" moiety is a moiety that
can be selectively cleaved without affecting or altering the composition of
the other portions of the compound of interest. For example, a cleavable
moiety L of the compounds provided herein is one that can be cleaved by
chemical, enzymatic, photolytic, or other means without affecting or
altering composition (e.g., the chemical composition) of the conjugated
biomolecule, including a protein. "Non-cleavable" moieties are those that
cannot be selectively cleaved without affecting or altering the
composition of the other portions of the compound of interest.

As used herein, binding with high affinity refers to a binding that
has an association constant (k_a) of at least 10⁹ and generally 10¹⁰, 10¹¹
liters/mole or greater, or a K_eq of 10⁹, 10¹⁰, 10¹¹, 10¹² or greater. For
purposes herein, high affinity bonds formed by the reactivity groups are
those that are stable to the laser (UV and IR) used in MALDI-MS analyses.

As used herein, "alkyl", "alkenyl" and "alkynyl", if not specified,
contain from 1 to 20 carbons, or 1 to 16 carbons, and are straight or
branched carbon chains. Alkenyl carbon chains are from 2 to 20 carbons,
and, in certain embodiments, contain 1 to 8 double bonds. Alkenyl
carbon chains of 1 to 16 carbons, in certain embodiments, contain 1 to 5
double bonds. Alkynyl carbon chains are from 2 to 20 carbons, and, in
one embodiment, contain 1 to 8 triple bonds. Alkynyl carbon chains of 2
to 16 carbons, in certain embodiments, contain 1 to 5 triple bonds.
Exemplary alkyl, alkenyl and alkynyl groups include, but are not limited to,
methyl, ethyl, propyl, isopropyl, isobutyl, n-butyl, sec-butyl, tert-butyl,
isopentyl, neopentyl, tert-pentyl and isohexyl. The alkyl, alkenyl and
alkynyl groups, unless otherwise specified, can be optionally substituted
with one or more groups, including alkyl group substituents that can be
the same or different.

As used herein, "lower alkyl", "lower alkenyl", and "lower alkynyl"
refer to carbon chains having less than about 6 carbons.

As used herein, "alk(en)(yn)yl" refers to an alkyl group containing 
at least one double bond and at least one triple bond. 

As used herein, an "alkyl group substituent" includes, but is not
limited to, halo, haloalkyl, including halo lower alkyl, aryl, hydroxy,
alkoxy, aryloxy, alkyloxy, alkylthio, arylthio, aralkyloxy, aralkylthio,
carboxy, alkoxycarbonyl, oxo and cycloalkyl.

As used herein, "aryl" refers to aromatic groups containing from 5
to 20 carbon atoms and can be a mono-, multicyclic or fused ring system.
Aryl groups include, but are not limited to, phenyl, naphthyl, biphenyl,
fluorenyl and others that can be unsubstituted or are substituted with one
or more substituents.

As used herein, "aryl" also refers to aryl-containing groups,
including, but not limited to, aryloxy, arylthio, arylcarbonyl and arylamino
groups.

As used herein, an "aryl group substituent" includes, but is not
limited to, alkyl, alkenyl, alkynyl, cycloalkyl, cycloalkylalkyl, aryl,
heteroaryl optionally substituted with 1 or more, including 1 to 3,
substituents selected from halo, halo alkyl and alkyl, aralkyl,
heteroaralkyl, alkenyl containing 1 to 2 double bonds, alkynyl containing
1 to 2 triple bonds, alk(en)(yn)yl groups, halo, pseudohalo, cyano,
hydroxy, haloalkyl and polyhaloalkyl, including halo lower alkyl, especially
trifluoromethyl, formyl, alkylcarbonyl, arylcarbonyl that is optionally
substituted with 1 or more, including 1 to 3, substituents selected from
halo, halo alkyl and alkyl, heteroarylcarbonyl, carboxy, alkoxycarbonyl,
aryloxycarbonyl, aminocarbonyl, alkylaminocarbonyl,
dialkylaminocarbonyl, arylaminocarbonyl, diarylaminocarbonyl,
aralkylaminocarbonyl, alkoxy, aryloxy, perfluoroalkoxy, alkenyloxy,
alkynyloxy, arylalkoxy, aminoalkyl, alkylaminoalkyl, dialkylaminoalkyl,
arylaminoalkyl, amino, alkylamino, dialkylamino, arylamino,
alkylarylamino, alkylcarbonylamino, arylcarbonylamino, azido, nitro,
mercapto, alkylthio, arylthio, perfluoroalkylthio, thiocyano,
isothiocyano, alkylsulfinyl, alkylsulfonyl, arylsulfinyl, arylsulfonyl,
aminosulfonyl, alkylaminosulfonyl, dialkylaminosulfonyl and
arylaminosulfonyl.

As used herein, "aralkyl" refers to an alkyl group in which one of the
hydrogen atoms of the alkyl is replaced by an aryl group.

As used herein, "heteroaralkyl" refers to an alkyl group in which one
of the hydrogen atoms of the alkyl is replaced by a heteroaryl group.

As used herein, "cycloalkyl" refers to a saturated mono- or
multicyclic ring system, in one embodiment, of 3 to 10 carbon atoms, or 3 to
6 carbon atoms; cycloalkenyl and cycloalkynyl refer to mono- or
multicyclic ring systems that respectively include at least one double bond
and at least one triple bond. Cycloalkenyl and cycloalkynyl groups can
contain, in one embodiment, 3 to 10 carbon atoms, with cycloalkenyl
groups, in other embodiments, containing 4 to 7 carbon atoms and
cycloalkynyl groups, in other embodiments, containing 8 to 10 carbon
atoms. The ring systems of the cycloalkyl, cycloalkenyl and cycloalkynyl
groups can be composed of one ring or two or more rings that can be
joined together in a fused, bridged or spiro-connected fashion, and can be
optionally substituted with one or more alkyl group substituents.
"Cycloalk(en)(yn)yl" refers to a cycloalkyl group containing at least one
double bond and at least one triple bond.

As used herein, "heteroaryl" refers to a monocyclic or multicyclic
ring system, in one embodiment of about 5 to about 15 members where
one or more, or 1 to 3, of the atoms in the ring system is a heteroatom,
that is, an element other than carbon, for example, nitrogen, oxygen
and sulfur atoms. The heteroaryl can be optionally substituted with one
or more, including 1 to 3, aryl group substituents. The heteroaryl group
can be optionally fused to a benzene ring. Exemplary heteroaryl groups
include, but are not limited to, pyrroles, porphyrines, furans, thiophenes,
selenophenes, pyrazoles, imidazoles, triazoles, tetrazoles, oxazoles,
oxadiazoles, thiazoles, thiadiazoles, indoles, carbazoles, benzofurans,
benzothiophenes, indazoles, benzimidazoles, benzotriazoles,
benzoxatriazoles, benzothiazoles, benzoselenazoles, benzothiadiazoles,
benzoselenadiazoles, purines, pyridines, pyridazines, pyrimidines,
pyrazines, triazines, quinolines, acridines, isoquinolines,
cinnolines, phthalazines, quinazolines, quinoxalines, phenazines,
phenanthrolines, imidazinyl, pyrrolidinyl, pyrimidinyl, tetrazolyl, thienyl,
pyridyl, pyrrolyl, N-methylpyrrolyl, quinolinyl and isoquinolinyl.

As used herein, "heteroaryl" also refers to heteroaryl-containing 
groups, including, but not limited to, heteroaryloxy, heteroarylthio, 
heteroarylcarbonyl and heteroarylamino. 

As used herein, "heterocyclic" refers to a monocyclic or multicyclic
ring system, in one embodiment of 3 to 10 members, in another
embodiment 4 to 7 members, including 5 to 6 members, where one or
more, including 1 to 3, of the atoms in the ring system is a heteroatom,
that is, an element other than carbon, for example, nitrogen, oxygen
and sulfur atoms. The heterocycle can be optionally substituted with one
or more, or 1 to 3, aryl group substituents. In certain embodiments,
substituents of the heterocyclic group include hydroxy, amino, alkoxy
containing 1 to 4 carbon atoms, halo lower alkyl, including trihalomethyl,
such as trifluoromethyl, and halogen. As used herein, the term
heterocycle can include reference to heteroaryl.

As used herein, the nomenclature alkyl, alkoxy, carbonyl, etc., is
used as is generally understood by those of skill in this art. For example,
as used herein alkyl refers to saturated carbon chains that contain one or
more carbons; the chains can be straight or branched or include cyclic
portions or be cyclic.

Where the number of any given substituent is not specified (e.g.,
"haloalkyl"), there can be one or more substituents present. For example,
"haloalkyl" can include one or more of the same or different halogens. As
another example, "C1-3alkoxyphenyl" can include one or more of the same
or different alkoxy groups containing one, two or three carbons.

Where named substituents such as carboxy or substituents
represented by variables such as W are separately enclosed in
parentheses, yet possess no subscript outside the parentheses indicating
numerical value and that follow substituents not in parentheses, e.g.,
"C1-4alkyl(W)(carboxy)", "W" and "carboxy" are each directly attached to
C1-4alkyl.

As used herein, "halogen" or "halide" refers to F, Cl, Br or I.

As used herein, pseudohalides are compounds that behave
substantially similarly to halides. Such compounds can be used in the same
manner and treated in the same manner as halides (X-, in which X is a
halogen, such as Cl or Br). Pseudohalides include, but are not limited to,
cyanide, cyanate, isocyanate, thiocyanate, isothiocyanate, selenocyanate,
trifluoromethoxy, and azide.

As used herein, "haloalkyl" refers to a lower alkyl radical in which
one or more of the hydrogen atoms are replaced by halogen including, but
not limited to, chloromethyl, trifluoromethyl, 1-chloro-2-fluoroethyl and
the like.

As used herein, "haloalkoxy" refers to RO- in which R is a haloalkyl
group.

As used herein, "sulfinyl" or "thionyl" refers to -S(O)-. As used
herein, "sulfonyl" or "sulfuryl" refers to -S(O)2-. As used herein, "sulfo"
refers to -S(O)2O-.

As used herein, "carboxy" refers to a divalent radical, -C(O)O-.

As used herein, "aminocarbonyl" refers to -C(O)NH2.

As used herein, "alkylaminocarbonyl" refers to -C(O)NHR in which R
is hydrogen or alkyl, including lower alkyl.

As used herein, "dialkylaminocarbonyl" refers to
-C(O)NR'R in which R' and R are independently selected from hydrogen or
alkyl, including lower alkyl.

As used herein, "carboxamide" refers to groups of formula
-NRCOR.

As used herein, "diarylaminocarbonyl" refers to -C(O)NRR' in which R
and R' are independently selected from aryl, including lower aryl, such as
phenyl.

As used herein, "aralkylaminocarbonyl" refers to -C(O)NRR' in which
one of R and R' is aryl, including lower aryl, such as phenyl, and the other
of R and R' is alkyl, including lower alkyl.

As used herein, "arylaminocarbonyl" refers to -C(O)NHR in which R is
aryl, including lower aryl, such as phenyl.

As used herein, "alkoxycarbonyl" refers to -C(O)OR in which R is
alkyl, including lower alkyl.

As used herein, "aryloxycarbonyl" refers to -C(O)OR in which R is
aryl, including lower aryl, such as phenyl.

As used herein, "alkoxy" and "alkylthio" refer to RO- and RS-, in
which R is alkyl, including lower alkyl.

As used herein, "aryloxy" and "arylthio" refer to RO- and RS-, in
which R is aryl, including lower aryl, such as phenyl.

As used herein, "alkylene" refers to a straight, branched or cyclic,
in one embodiment straight or branched, divalent aliphatic hydrocarbon
group, in certain embodiments having from 1 to about 20 carbon atoms,
in other embodiments 1 to 12 carbons, including lower alkylene. The
alkylene group is optionally substituted with one or more "alkyl group
substituents." There can be optionally inserted along the alkylene group
one or more oxygen, sulphur or substituted or unsubstituted nitrogen
atoms, where the nitrogen substituent is alkyl as previously described.
Exemplary alkylene groups include methylene (-CH2-), ethylene
(-CH2CH2-), propylene (-(CH2)3-), cyclohexylene (-C6H10-),
methylenedioxy (-O-CH2-O-) and ethylenedioxy (-O-(CH2)2-O-). The term
"lower alkylene" refers to alkylene groups having 1 to 6 carbons. In
certain embodiments, alkylene groups are lower alkylene, including
alkylene of 1 to 3 carbon atoms.

As used herein, "alkenylene" refers to a straight, branched or
cyclic, in one embodiment straight or branched, divalent aliphatic
hydrocarbon group, in certain embodiments having from 2 to about 20
carbon atoms and at least one double bond, in other embodiments 1 to
12 carbons, including lower alkenylene. The alkenylene group is
optionally substituted with one or more "alkyl group substituents." There
can be optionally inserted along the alkenylene group one or more
oxygen, sulphur or substituted or unsubstituted nitrogen atoms, where
the nitrogen substituent is alkyl as previously described. Exemplary
alkenylene groups include -CH=CH-CH=CH- and -CH=CH-CH2-.
The term "lower alkenylene" refers to alkenylene groups having 2 to 6
carbons. In certain embodiments, alkenylene groups are lower
alkenylene, including alkenylene of 3 to 4 carbon atoms.

As used herein, "alkynylene" refers to a straight, branched or
cyclic, in one embodiment straight or branched, divalent aliphatic
hydrocarbon group, in certain embodiments having from 2 to about 20
carbon atoms and at least one triple bond, in other embodiments 1 to 12
carbons, including lower alkynylene. The alkynylene group is optionally
substituted with one or more "alkyl group substituents." There can be
optionally inserted along the alkynylene group one or more oxygen,
sulphur or substituted or unsubstituted nitrogen atoms, where the
nitrogen substituent is alkyl as previously described. Exemplary
alkynylene groups include -C≡C-C≡C-, -C≡C- and -C≡C-CH2-. The
term "lower alkynylene" refers to alkynylene groups having 2 to 6
carbons. In certain embodiments, alkynylene groups are lower
alkynylene, including alkynylene of 3 to 4 carbon atoms.

As used herein, "alk(en)(yn)ylene" refers to a straight, branched or
cyclic, in one embodiment straight or branched, divalent aliphatic
hydrocarbon group, in certain embodiments having from 2 to about 20
carbon atoms and at least one triple bond, and at least one double bond;
in other embodiments 1 to 12 carbons, including lower alk(en)(yn)ylene.
The alk(en)(yn)ylene group is optionally substituted with one or more
"alkyl group substituents." There can be optionally inserted along the
alk(en)(yn)ylene group one or more oxygen, sulphur or substituted or
unsubstituted nitrogen atoms, where the nitrogen substituent is alkyl as
previously described. Exemplary alk(en)(yn)ylene groups include
-C=C-(CH2)n-C≡C-, where n is 1 or 2. The term "lower
alk(en)(yn)ylene" refers to alk(en)(yn)ylene groups having up to 6
carbons. In certain embodiments, alk(en)(yn)ylene groups are lower
alk(en)(yn)ylene, including alk(en)(yn)ylene of 4 carbon atoms.

As used herein, "arylene" refers to a monocyclic or polycyclic, in
one embodiment monocyclic, divalent aromatic group, in certain
embodiments having from 5 to about 20 carbon atoms and at least one
aromatic ring, in other embodiments 5 to 12 carbons, including lower
arylene. The arylene group is optionally substituted with one or more
"alkyl group substituents." There can be optionally inserted around the
arylene group one or more oxygen, sulphur or substituted or
unsubstituted nitrogen atoms, where the nitrogen substituent is alkyl as
previously described. Exemplary arylene groups include 1,2-, 1,3- and
1,4-phenylene. The term "lower arylene" refers to arylene groups having
5 or 6 carbons. In certain embodiments, arylene groups are lower
arylene.

As used herein, "heteroarylene" refers to a divalent monocyclic or
multicyclic ring system, in one embodiment of about 5 to about 15
members where one or more, or 1 to 3, of the atoms in the ring system is
a heteroatom, that is, an element other than carbon, for example,
nitrogen, oxygen and sulfur atoms. The heteroarylene group can be
optionally substituted with one or more, or 1 to 3, aryl group
substituents.

As used herein, "alkylidene" refers to a divalent group, such as
=CR'R", which is attached to one atom of another group, forming a
double bond. Exemplary alkylidene groups are methylidene (=CH2) and
ethylidene (=CHCH3). As used herein, "aralkylidene" refers to an
alkylidene group in which either R' or R" is an aryl group.

As used herein, "amido" refers to the divalent group -C(O)NH-.
"Thioamido" refers to the divalent group -C(S)NH-. "Oxyamido" refers to
the divalent group -OC(O)NH-. "Thiaamido" refers to the divalent group
-SC(O)NH-. "Dithiaamido" refers to the divalent group -SC(S)NH-.
"Ureido" refers to the divalent group -HNC(O)NH-. "Thioureido" refers to
the divalent group -HNC(S)NH-.

As used herein, "semicarbazide" refers to -NHC(O)NHNH-.
"Carbazate" refers to the divalent group -OC(O)NHNH-.
"Isothiocarbazate" refers to the divalent group -SC(O)NHNH-.
"Thiocarbazate" refers to the divalent group -OC(S)NHNH-.
"Sulfonylhydrazide" refers to the group -SO2NHNH-. "Hydrazide" refers
to the divalent group -C(O)NHNH-. "Azo" refers to the divalent group
-N=N-. "Hydrazinyl" refers to the divalent group -NH-NH-.

As used herein, the term "amino acid" refers to α-amino acids that
are racemic, or of either the D- or L-configuration. The designation "d"
preceding an amino acid designation (e.g., dAla, dSer, dVal, etc.) refers
to the D-isomer of the amino acid. The designation "dl" preceding an
amino acid designation (e.g., dlAla) refers to a mixture of the L- and D-
isomers of the amino acid.

As used herein, when any particular group, such as phenyl or 
pyridyl, is specified, this means that the group is unsubstituted or is 
substituted. Substituents where not specified are halo, halo lower alkyl, 
and lower alkyl. 

As used herein, conformationally altered protein disease (or a
disease of protein aggregation) refers to diseases associated with a
protein or polypeptide that has a disease-associated conformation. The
methods and collections provided herein permit detection of a conformer
associated with a disease. Diseases and associated
proteins that exhibit two or more different conformations in which at least
one conformation is a conformationally altered protein include, but are
not limited to, amyloid diseases and other neurodegenerative diseases
known to those of skill in the art and set forth below.

As used herein, cell sorting refers to an assay in which cells are
separated and recovered from suspension based upon properties
measured in flow cytometry analysis. Most assays used for analysis can
serve as the basis for sorting experiments, as long as gates and regions
defining the subpopulation(s) to be sorted do not logically overlap.
Maximum throughput rates are typically 5000 cells/second (18 x 10^6
cells/hour). The rate of collection of the separated population(s) depends
primarily upon the condition of the cells and the percentage of reactivity.
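
The throughput figures above are related by a simple unit conversion, which can be checked with a short Python sketch. The function names are illustrative only and are not part of the disclosure; a constant sorting rate is assumed.

```python
SECONDS_PER_HOUR = 3600


def cells_per_hour(cells_per_second):
    """Convert a sorting rate from cells/second to cells/hour."""
    return cells_per_second * SECONDS_PER_HOUR


def hours_to_sort(total_cells, cells_per_second):
    """Hours needed to process total_cells at a constant sorting rate."""
    if cells_per_second <= 0:
        raise ValueError("cells_per_second must be positive")
    return total_cells / cells_per_hour(cells_per_second)


if __name__ == "__main__":
    # 5000 cells/second corresponds to 18 x 10^6 cells/hour.
    print(cells_per_hour(5000))
```

In practice the rate of collection would be lower than this ceiling, since it also depends on cell condition and the percentage of reactivity, as noted above.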

As used herein, the abbreviations for any protective groups, amino
acids and other compounds, are, unless indicated otherwise, in accord
with their common usage, recognized abbreviations, or the IUPAC-IUB
Commission on Biochemical Nomenclature (see, Biochem. 1972, 11:942).
For example, DMF = N,N-dimethylformamide; DMAc = N,N-dimethylacetamide;
THF = tetrahydrofuran; TRIS = tris(hydroxymethyl)aminomethane;
SSPE = saline-sodium phosphate-EDTA buffer; EDTA =
ethylenediaminetetraacetic acid; SDS = sodium dodecyl sulfate.

B. Arrays of compounds 

Gradient arrays of capture compounds that selectively bind to
biomolecules in samples, particularly, although not exclusively, a cell
lysate or in vitro translated polypeptides from a cell lysate, are
provided. Each locus in the array can bind to specific groups
or classes of biopolymers, and is designed to bind covalently or tightly
(sufficient to sustain mass spectrometric analysis, for example) to a subset
of all of the biomolecules in the sample. For example, a sample, such as
a cell lysate, can contain 1000's of members. The arrays of
compounds permit sufficient selectivity so that, for example, about 10-20
of the components of the sample bind to each member of the collection.
The exact number is a small enough number for routine analyses to
identify them, generally in one step, such as by mass spectrometry.

The arrays permit a top down holistic approach to analysis of the
proteome, including post-translationally modified proteins, and other
biomolecules. Protein and other biomolecule patterns are the starting
point for analyses that use these arrays, rather than nucleic acids and
the genome (bottom up). The arrays can be used to assess the
biomolecule components of a sample, such as a biological sample, to
identify components specific to a particular phenotype, such as a disease
state, to identify structural function, biochemical pathways and
mechanisms of action. The arrays and methods of use permit an
unbiased analysis of biomolecules, since the methods do not necessarily
assess specific classes of targets; instead, changes in samples are
detected or identified. The arrays permit the components of a complex
mixture of biomolecules (i.e., a mixture of 50, 100, 500, 1000, 2000 and
more) to be sorted into discrete loci containing reduced numbers, typically
by 10%, 50% or greater reduction in complexity, or to about 1 to 50
different biomolecules per locus in an array, so that the components at
each spot can be analyzed, such as by mass spectrometric analysis alone
or in combination with other analyses. In some embodiments, such as for
phenotypic analyses, homogeneity of the starting sample, such as cells,
can be important. To provide homogeneity, cells with different
phenotypes, such as diseased versus healthy, from the same individual
are compared. Methods for doing so are provided herein.

By virtue of the structure of moieties at the loci in the arrays, the
arrays can be used to detect structural changes, such as those from the
post-translational processing of proteins, and can be used to detect
changes in membrane proteins, which are involved in the most
fundamental processes, such as signal transduction, ion channels,
receptors for ligand interaction and cell-to-cell interactions. When cells
become diseased, changes associated with disease, such as
transformation, often occur in membrane proteins.

The arrays contain sets of moieties at each locus. The moieties at
each locus include an X and optionally a Y group. In general, the members
of each set differ in at least one property, and generally in two
or three or more, from those at each other locus. As provided herein, the
differences comprise a gradient, typically a two-dimensional gradient
based upon gradual changes in properties of each X and Y at each locus
and across the X and Y axes in the array.
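
The two-dimensional gradient described above can be pictured as a grid in which one property changes stepwise along each axis. The following Python sketch illustrates only that layout. It assumes, purely for illustration, that a property of X and a property of Y can each be summarized by a single number that changes linearly; the names `Locus`, `linear_steps` and `build_gradient_array` are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    """One spot in the array; the levels are hypothetical scalar stand-ins."""
    row: int
    col: int
    x_level: float  # assumed numeric proxy for a property of the X group
    y_level: float  # assumed numeric proxy for a property of the Y group


def linear_steps(start, stop, count):
    """Return `count` evenly spaced values from start to stop, inclusive."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        return [float(start)]
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


def build_gradient_array(rows, cols, x_range=(0.0, 1.0), y_range=(0.0, 1.0)):
    """Build a rows x cols grid: X varies along columns, Y along rows."""
    x_levels = linear_steps(x_range[0], x_range[1], cols)
    y_levels = linear_steps(y_range[0], y_range[1], rows)
    return [
        [Locus(r, c, x_levels[c], y_levels[r]) for c in range(cols)]
        for r in range(rows)
    ]
```

Each locus then differs from its neighbors by one step in the X property, the Y property, or both. A real array would vary chemical properties such as steric bulk or charge rather than an abstract number.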

In practicing methods, the arrays are contacted with a sample or
partially purified or purified components thereof to effect binding of
biomolecules to capture compounds in the array. The resulting array is
optionally treated with a reagent that specifically cleaves the bound
polymers, such as a protease, and is subjected to analysis, particularly
mass spectrometric analysis to identify components of the bound
biomolecules at each locus. Once a molecular weight of a biomolecule,
such as a protein or portion thereof of interest is determined, the
biomolecule can be identified. Methods for identification include
comparison of the molecular weights with databases, for example protein
databases that include protease fragments and their molecular weights.

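
The comparison of measured molecular weights with database entries described above can be sketched as a tolerance-based lookup. This is a minimal Python illustration under stated assumptions, not the method of the disclosure: the function name, the fixed 0.5 Da window and the reference entries are all placeholders.

```python
def match_masses(observed_masses, reference_masses, tolerance_da=0.5):
    """Map each observed mass to the reference entries within tolerance.

    observed_masses: iterable of measured molecular weights (Da).
    reference_masses: dict of entry name -> reference molecular weight (Da).
    tolerance_da: assumed absolute matching window in Da (placeholder value).
    """
    matches = {}
    for observed in observed_masses:
        matches[observed] = sorted(
            name
            for name, reference in reference_masses.items()
            if abs(reference - observed) <= tolerance_da
        )
    return matches


if __name__ == "__main__":
    # Placeholder entries for illustration only; not real protein data.
    reference = {"fragment_A": 1000.0, "fragment_B": 1500.0}
    print(match_masses([1000.2, 1200.0], reference))
```

An observed mass with no hits maps to an empty list, while several hits indicate an ambiguous assignment that would need further analysis. Actual database searches use protease-specific fragment lists and instrument-dependent tolerances.
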
The X and Y groups are functional groups that confer reactivity,
selectivity and separative properties, depending on the specificity of
separation and analysis required (which depends on the complexity of the
mixture to be analyzed). In general, the loci in the arrays include at least
two functional groups (functions) selected from: a reactivity function (X),
which binds to biopolymers either covalently or with a high k_a (generally
greater than about 10^9, 10^10, 10^12 liters/mole and/or such that the binding
is substantially irreversible or stable under conditions of mass
spectrometric analyses, such as MALDI-MS conditions); and a selectivity
function (Y), which by virtue of non-covalent interactions alters, generally
increases, the specificity of the reactivity function.
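
To put the affinity thresholds above in more familiar units, an association constant in liters/mole can be converted to the corresponding dissociation constant in moles/liter by taking its reciprocal. The Python sketch below assumes simple 1:1 binding; the function names are illustrative only.

```python
def dissociation_constant(k_a):
    """Return K_d in moles/liter for an association constant in liters/mole."""
    if k_a <= 0:
        raise ValueError("k_a must be positive")
    return 1.0 / k_a


def meets_affinity_threshold(k_a, threshold=1e9):
    """True if k_a exceeds the threshold (default 10^9 liters/mole)."""
    return k_a > threshold


if __name__ == "__main__":
    for k_a in (1e9, 1e10, 1e12):
        print(k_a, dissociation_constant(k_a))
```

Under this assumption, an association constant of 10^9 liters/mole corresponds to a dissociation constant of 10^-9 moles/liter, that is, about 1 nanomolar.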

In general, the reactivity function is a reactive group that
specifically interacts, typically covalently or with high binding affinity
(k_a), with particular biomolecules, such as proteins, or portions thereof; and
the other functionality, the selectivity function, alters, typically
increasing, the specificity of the reactivity function. In general, the
reactive function covalently interacts with groups on a particular
biomolecule, such as amine groups on the surface of a protein. The
reactivity function interacts with biomolecules to form a covalent bond or
a non-covalent bond that is stable under conditions of analysis, generally
with a k_a of greater than 10^9 liters/mole or greater than 10^10 liters/mole.
Conditions of analysis include, but are not limited to, mass
spectrometric analysis, such as matrix assisted laser desorption
ionization-time of flight (MALDI-TOF) mass spectrometry. The selectivity
function influences the types of biomolecules that can interact with the
reactivity function through a non-covalent interaction. The selectivity
function alters the specificity for the particular groups, generally reducing
the number of such groups with which the reactivity function reacts. A
goal is to reduce the number of proteins or biomolecules bound at a
locus, so that the proteins can then be separated, such as by mass
spectrometry.

Included among the moieties as provided herein are those that can
be classified in at least two sets: one for reactions in aqueous solution
(e.g., for reaction with hydrophilic biomolecules), and the other for
reaction in organic solvents (e.g., chloroform) (e.g., for reaction with
hydrophobic biomolecules). Thus, in certain embodiments, the arrays
herein discriminate between hydrophilic and hydrophobic biomolecules,
including, but not limited to, proteins, and allow for analysis of both
classes of biomolecules.
C. Components of the arrays

Arrays with loci containing bound moieties (also referred to as
capture agents) are provided. The arrays include a core "Z", the solid
surface, that presents one or more reactivity functions "X" and optionally
one or more of a selectivity function "Y", which are arrayed at the loci.
The particular manner in which the functions are presented on the solid
surface (Z) is a matter of design choice, but is selected such that the
resulting loci capture biomolecules, particularly proteins, with sufficient
specificity and either covalently or with bonds of sufficient stability or
affinity to permit analysis, such as by mass spectrometry, including
MALDI mass spectrometric analysis, so that at least a portion of bound
biomolecules remain bound (generally a binding affinity of 10^9, 10^10, 10^11
liters/mole or greater, or a K_eq of 10^9, 10^10, 10^11, 10^12 or greater).

X, the reactivity functionality, is selected to be anything that forms
such a covalent bond or a bond of high affinity that is stable under
conditions of mass spectrometric analysis, particularly MALDI analysis.
The selectivity functionality, Y, is a group that "looks" at the topology of
the protein around reactivity binding sites and functions to select
particular groups on biomolecules from among those with which a
reactivity group can form a covalent bond (or high affinity bond). For
example, a selectivity group can cause steric hindrance, or permit specific
binding to an epitope, or anything in between. It can be a substrate for a
drug, lipid, peptide. It selects the environment of the groups with which
the reactivity function interacts. The selectivity functionality, Y, can be
one whereby a capture compound forms a covalent bond with a
biomolecule in a mixture or interacts with high stability such that the affinity
of binding of the capture compound to the biomolecule through the
reactive functionality in the presence of the selectivity functionality is at
least ten-fold or 100-fold greater than in the absence of the selectivity
functionality.
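
The ten-fold or 100-fold criterion above amounts to a ratio of two binding affinities, one measured with the selectivity functionality present and one without it. A minimal Python sketch of that comparison follows. The function names and the idea of supplying both affinities as plain numbers in the same units are assumptions made for illustration.

```python
def fold_enhancement(affinity_with_y, affinity_without_y):
    """Ratio of binding affinity with Y present to affinity with Y absent."""
    if affinity_with_y < 0 or affinity_without_y <= 0:
        raise ValueError("affinities must be positive")
    return affinity_with_y / affinity_without_y


def meets_selectivity_criterion(affinity_with_y, affinity_without_y,
                                min_fold=10.0):
    """True if Y raises the affinity at least min_fold-fold (10 or 100)."""
    return fold_enhancement(affinity_with_y, affinity_without_y) >= min_fold
```

Passing `min_fold=100.0` applies the stricter of the two thresholds named in the text.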

Reactivity functions ("X") confer on the
compounds the ability to bind either covalently or with a high affinity
(greater than 10^9, generally greater than 10^10 or 10^11 liters/mole, typically
greater than a monoclonal antibody, and typically stable to mass
spectrometric analysis, such as MALDI-MS) to a biomolecule, particularly
proteins, including functional groups thereon, which include
post-translationally added groups. Generally the binding is covalent or is of
such affinity that it is stable under conditions of analysis, such as mass
spectral, including MALDI-TOF, analysis. Exemplary groups are set forth
herein.

In the compounds provided herein, X is a moiety that binds to or
interacts with the surface of a biomolecule, including, but not limited to,
the surface of a protein; an amino acid side chain of a protein; or an
active site of an enzyme (protein) or to functional groups of other
biomolecules, including lipids and polysaccharides.

Thus, for example, X is a group that reacts or interacts with
functionalities on the surface of a protein to form covalent or
non-covalent bonds with high affinity. A wide selection of different functional
groups is available for X to interact with a protein. For example, X can
act either as a nucleophile or an electrophile to form covalent bonds upon
reaction with the amino acid residues on the surface of a protein.
Exemplary reagents that bind covalently to amino acid side chains
include, but are not limited to, protecting groups for hydroxyl, carboxyl,
amino, amide, and thiol moieties, including, for example, those disclosed
in T.W. Greene and P.G.M. Wuts, "Protective Groups in Organic
Synthesis," 3rd ed. (1999, Wiley Interscience); photoreactive groups; and
Diels Alder couples (i.e., a diene on one side and a single double bond on
the other side). Other groups for X include those described in co-pending
U.S. application Serial No. 10/197,954 (see Figure 16).

The selectivity function ("Y") serves to modulate the reactivity
function by reducing the number of groups to which the reactivity
function binds, such as by steric hindrance and other interactions. It is a
group that modifies the steric and/or electronic (e.g., mesomeric,
inductive effects) properties as well as the resulting affinity properties of
the capture compound. Selectivity functions include any functional
groups that increase the selectivity of the reactivity group so that it binds
to fewer different biomolecules than in the absence of the selectivity
function or binds with greater affinity to biomolecules than in its
absence. In the capture compounds provided herein, Y is allowed to be
extensively varied depending on the goal to be achieved regarding steric
hindrance and electronic factors as they relate to modulating the
reactivity of the cleavable bond L, if present, and the reactive
functionality X. For example, a reactivity function X can be selected to
bind to amine groups on proteins; the selectivity function can be selected
to ensure that only groups exposed on the surface can be accessed. The
selectivity function is such that the compounds bind to or react with (via
the reactivity function) fewer different biomolecules when it is part of the
molecule than when it is absent and/or the compounds bind with greater
specificity and higher affinity. The selectivity function can be attached
directly to a compound or can be attached via a linker, such as CH2CO2
or CH2-O-(CH2)n-O, where n is an integer from 1 to 12, or 1 to 6, or 2
to 4. Other groups for Y include those described in co-pending U.S.
application Serial No. 10/197,954 (see Figure 17).

In certain embodiments, each Y is independently a group that
modifies the affinity properties and/or steric and/or electronic (e.g.,
mesomeric, inductive effects) properties of the resulting capture
compound. For example, Y, in certain embodiments, is selected from
ATP analogs and inhibitors; peptides and peptide analogs;
polyethyleneglycol (PEG); activated esters of amino acids, isolated or
within a peptide; cytochrome C; and hydrophilic trityl groups.

In another embodiment, Y is a small molecule moiety, a natural
product, a protein agonist or antagonist, a peptide or an antibody. In
another embodiment, Y is a hydrophilic compound or protein (e.g., PEG or
trityl ether), a hydrophobic compound or protein (e.g., polar aromatics,
lipids, glycolipids, phosphotriesters, oligosaccharides), a positively or
negatively charged group, a small molecule, a pharmaceutical compound
or a biomolecule that creates defined secondary or tertiary structures.

A more detailed description and discussion of each functionality is
provided in the description that follows.

Exemplary compounds and descriptions of X and Y 
It is understood that for the gradient arrays provided here, "Z" is a
solid support to which X and Y (and any other groups) are bound or
otherwise linked, directly or indirectly.

In one embodiment, the compounds for use in the methods
provided herein have formulae:
Q-Z-X or Q'-Z-X or Q-Z-Y or Q'-Z-Y or X-Q-Y
in which Q is a single stranded unprotected or suitably protected
oligonucleotide or oligonucleotide analog (e.g., PNA) of up to 50 building
blocks, which is capable of hybridizing with a base-complementary single
stranded nucleic acid molecule;

Z is a solid support to which X and Y are linked;

X is a functional group which interacts with and/or reacts with
functionalities on the surface of a biomolecule, including, but not limited
to, a protein, to form covalent bonds or bonds stable under conditions of
mass spectrometric analysis as defined herein; and

Y is a functional group which interacts with and/or reacts by
imposing unique selectivity by introducing functionalities which interact
noncovalently with target proteins.

In another embodiment, the compounds for use in the methods
provided herein have formula:

    Q-Z-X
      |
      Y

in which Q is a single stranded unprotected or suitably protected
oligonucleotide or oligonucleotide analog (e.g., peptide nucleic acid (PNA))
of up to 50 building blocks, which is capable of hybridizing with a
base-complementary single stranded nucleic acid molecule;

Z is a solid support;

X is a functional group which interacts with and/or reacts with
functionalities on the surface of a biomolecule, including, but not limited
to, a protein, to form covalent bonds or bonds stable under conditions of
mass spectrometric analysis; and

Y is a functional group which interacts with and/or reacts by
imposing unique selectivity by introducing functionalities which interact
noncovalently with target proteins.

In another embodiment, Y is selected from ATP analogs and
inhibitors; peptides and peptide analogs; polyethyleneglycol (PEG);
activated esters of amino acids, isolated or within a peptide; cytochrome
C; and hydrophilic trityl groups.

In another embodiment, the compounds for use in the methods
provided herein have the formulae:

[Structural formulae garbled in the source: apparently two branched
structures in which Q or Q' is attached to Z, and Z bears (X)m and (Y)n]

or Q'-Z-(X)m or Q-Z-(X)m or Q'-Z-(Y)n or Q'-Z-(Y)n,

where Q, Q', Z, X and Y are as defined above; m is an integer from 1 to
100, in one embodiment 1 to 10, in another embodiment 1 to 3, 4 or 5;
and n is an integer from 1 to 100, in one embodiment 1 to 10, in another
embodiment 1 to 3, 4 or 5.

In another embodiment, X is a pharmaceutical drug. Such
compounds may be used as drug transport molecules. The compounds of
these embodiments may be used in drug screening by capturing
biomolecules, including but not limited to proteins, which bind to the
pharmaceutical drug. Mutations in the biomolecules interfering with
binding to the pharmaceutical drug are identified, thereby determining
possible mechanisms of drug resistance. See, e.g., Hessler et al.
(November 9-11, 2001) Ninth Foresight Conference on Molecular
Nanotechnology (Abstract)
(http://www.foresight.org/Conferences/MNT9/Abstracts/Hessler/).

In further embodiments, the compounds for use in the methods
provided herein are those of the above formulae, where Z is an insoluble
support or a substrate, including a bead, including, but not limited to,
polymeric, magnetic, colored, Rf-tagged, etc. beads, that, in certain
embodiments, is linked to X through a first optional spacer and a
cleavable linkage, and is linked to Q through a second optional spacer. In
these embodiments, the density of the biopolymer to be analyzed, and
thus signal intensity of the subsequent analysis, is increased relative to
embodiments where Z is a divalent or multivalent group. In these
embodiments, an appropriate array of single stranded oligonucleotides or
oligonucleotide analogs that are complementary to Q will be employed in
the methods provided herein.
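
By way of non-limiting illustration, the use of an array of single
stranded oligonucleotides complementary to Q can be sketched in software
as a lookup from each Q sequence to the array locus carrying its reverse
complement. The sketch below is an assumption-laden example, not a
method recited herein: it assumes Q is plain DNA written 5' to 3',
requires exact complementarity, and uses hypothetical names
(reverse_complement, assign_addresses) and hypothetical sequences. The
limit of 50 building blocks is taken from the definition of Q above.

```python
# Hypothetical sketch: assign compounds to array loci by base complementarity.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
MAX_BUILDING_BLOCKS = 50  # upper bound on the length of Q stated in the text


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def assign_addresses(q_tags: dict, array_probes: dict) -> dict:
    """Map each compound name to the locus whose probe exactly complements its Q.

    q_tags maps compound name -> Q sequence; array_probes maps locus -> probe
    sequence. A compound with no matching probe is mapped to None.
    """
    probe_index = {probe.upper(): locus for locus, probe in array_probes.items()}
    addresses = {}
    for name, q in q_tags.items():
        if len(q) > MAX_BUILDING_BLOCKS:
            raise ValueError(f"{name}: Q exceeds {MAX_BUILDING_BLOCKS} building blocks")
        addresses[name] = probe_index.get(reverse_complement(q))
    return addresses


if __name__ == "__main__":
    # Illustrative sequences only; real probes would be considerably longer.
    probes = {(0, 0): "TTGCA", (0, 1): "CCGTA"}
    tags = {"compound_1": "TGCAA", "compound_2": "TACGG", "compound_3": "AAAAA"}
    print(assign_addresses(tags, probes))
```

Exact matching is the simplest possible model; a practical design would
also consider melting temperature and cross-hybridization between tags,
which this sketch does not attempt.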

In further embodiments herein, Z is as defined above, and is a
non-cleavable linker. The compounds of these embodiments are generally
useful in embodiments where cleavage of the biopolymer from the Q
moiety is not required prior to or during analysis of the biopolymer,
including analysis of biopolymer-biopolymer interactions, such as
protein-protein interactions, or biopolymer-small molecule (e.g., drug or
drug candidate) interactions, including protein-small molecule (e.g., drug
or drug candidate) interactions.

In all embodiments herein, the compounds for use in the methods
herein may be classified in at least two sets: one for reactions in aqueous
solution (e.g., for reaction with hydrophilic biomolecules), and the other
for reaction in organic solvents (e.g., chloroform) (e.g., for reaction with
hydrophobic biomolecules). Thus, in certain embodiments, the
compounds provided herein discriminate between hydrophilic and
hydrophobic biomolecules, including, but not limited to, proteins, and
allow for analysis of both classes of biomolecules.

The variables Y, X and Z, with reference to linkers and spacers, are
described in further detail below. The descriptions below apply to all of
the above formulae.

2. The moiety Z 

For purposes herein, Z is a solid support. The following includes a
description of linkers and spacers for linking X and Y moieties to Z.

a. Z is a cleavable moiety 
In certain embodiments herein, in the compounds for use in the
methods provided herein, Z is a moiety that is cleavable prior to or during
analysis of the biomolecule, including mass spectral analysis, without
altering the chemical structure of the biomolecule, including, but not
limited to, a protein. In one embodiment, the methods provided herein
include methods of mass spectral analysis of biomolecules, including
proteins, that are displayed in an addressable format. In certain
embodiments, the format is an array of single stranded oligonucleotides
that are complementary to the oligonucleotide portions, or oligonucleotide
analog portions, (Q) of the compounds. In these embodiments, Z is a
group that is (i) stable to the reaction conditions required for reaction of
the compounds provided herein with the biomolecule, such as a protein,
(ii) stable to the conditions required for hybridization of the Q moiety with
the single stranded oligonucleotides, and (iii) cleavable prior to or during
analysis of the biomolecule.

In one embodiment, Z is a photocleavable group that is cleaved by
a laser used in MALDI-TOF mass spectrometry. In another embodiment,
Z is an acid labile group that is cleaved upon application of a matrix to the
hybridized compound-biomolecule conjugates, or by exposure to acids
(e.g., trifluoroacetic or hydrochloric acids) in a vapor or liquid form, prior
to analysis. In this embodiment, the matrix maintains the spatial integrity
of the array, allowing for addressable analysis of the array.

b. Z is a non-cleavable moiety

In other embodiments herein, the compounds for use in the
methods provided herein have a Z moiety that is not cleavable under
conditions used for analysis of biomolecules, including, but not limited to,
mass spectrometry, such as matrix assisted laser desorption
ionization-time of flight (MALDI-TOF) mass spectrometry. The compounds
of these embodiments are useful, e.g., in methods provided herein for
determining biomolecule-biomolecule, including protein-protein,
interactions, and for determining biomolecule-small molecule, including
protein-drug or protein-drug candidate, interactions. In these
embodiments, it is not necessary for the Z group to be cleaved for the
analysis.

c. Divalent or multivalent Z moieties 
In one embodiment, Z is a cleavable or non-cleavable divalent or
multivalent group that contains less than 50, or less than 20 members,
and is selected from straight or branched chain alkylene, straight or
branched chain alkenylene, straight or branched chain alkynylene, straight
or branched chain alkylenoxy, straight or branched chain alkylenthio,
straight or branched chain alkylencarbonyl, straight or branched chain
alkylenamino, cycloalkylene, cycloalkenylene, cycloalkynylene,
cycloalkylenoxy, cycloalkylenthio, cycloalkylencarbonyl,
cycloalkylenamino, heterocyclylene, arylene, arylenoxy, arylenthio,
arylencarbonyl, arylenamino, heteroarylene, heteroarylenoxy,
heteroarylenthio, heteroarylencarbonyl, heteroarylenamino, oxy, thio,
carbonyl, carbonyloxy, ester, amino, amido, phosphino, phosphineoxido,
phosphoramidato, phosphinamidato, sulfonamide, sulfonyl, sulfoxide,
carbamate, ureido, and combinations thereof, and is optionally substituted
with one or more, including one to four, substituents each independently
selected from R15;

each R15 is independently a group that modifies the steric and/or
electronic (e.g., mesomeric, inductive effects) properties of Z; in one
embodiment, R15 is a group that is a component of a luminescent,
including fluorescent, phosphorescent, chemiluminescent and
bioluminescent, system, or is a group that may be detected in a
colorimetric assay.

Fluorescent, colorimetric and phosphorescent groups are well
known to those of skill in the art (see, e.g., U.S. Patent No. 6,274,337;
Sapan et al. (1999) Biotechnol. Appl. Biochem. 29 (Pt. 2):99-108;
Sittampalam et al. (1997) Curr. Opin. Chem. Biol. 1(3):384-91; Lakowicz,
J. R., Principles of Fluorescence Spectroscopy, New York: Plenum Press
(1983); Herman, B., Resonance Energy Transfer Microscopy, in:
Fluorescence Microscopy of Living Cells in Culture, Part B, Methods in
Cell Biology, vol. 30, ed. Taylor, D. L. & Wang, Y.-L., San Diego:
Academic Press (1989), pp. 219-243; Turro, N. J., Modern Molecular
Photochemistry, Menlo Park: Benjamin/Cummings Publishing Co., Inc.
(1978), pp. 296-361 and the Molecular Probes Catalog (1997), OR,
USA). Fluorescent moieties include, but are not limited to, 1- and
2-aminonaphthalene, p,p'-diaminostilbenes, pyrenes, quaternary
phenanthridine salts, 9-aminoacridines, p,p'-diaminobenzophenone imines,
anthracenes, oxacarbocyanine, merocyanine, 3-aminoequilenin, perylene,
bis-benzoxazole, bis-p-oxazolyl benzene, 1,2-benzophenazin, retinol,
bis-3-aminopyridinium salts, hellebrigenin, tetracycline, sterophenol,
benzimidazolylphenylamine, 2-oxo-3-chromen, indole, xanthen,
7-hydroxycoumarin, phenoxazine, salicylate, strophanthidin, porphyrins,
triarylmethanes and flavin. Fluorescent compounds which have
functionalities for linking to a compound provided herein, or which can be
modified to incorporate such functionalities include, e.g., dansyl chloride;
fluoresceins such as 3,6-dihydroxy-9-phenylxanthhydrol;
rhodamine isothiocyanate; N-phenyl 1-amino-8-sulfonatonaphthalene;
N-phenyl 2-amino-6-sulfonatonaphthalene;
4-acetamido-4-isothiocyanato-stilbene-2,2'-disulfonic acid;
pyrene-3-sulfonic acid; 2-toluidinonaphthalene-6-sulfonate;
N-phenyl-N-methyl-2-aminonaphthalene-6-sulfonate; ethidium
bromide; stebrine; auromine-O, 2-(9'-anthroyl)palmitate; dansyl
phosphatidylethanolamine; N,N'-dioctadecyl oxacarbocyanine;
N,N'-dihexyl oxacarbocyanine; merocyanine, 4-(3'-pyrenyl)stearate;
d-3-aminodesoxy-equilenin; 12-(9'-anthroyl)stearate; 2-methylanthracene;
9-vinylanthracene; 2,2'-(vinylene-p-phenylene)bisbenzoxazole;
p-bis(2-(4-methyl-5-phenyl-oxazolyl))benzene;
6-dimethylamino-1,2-benzophenazin;
retinol; bis(3'-aminopyridinium) 1,10-decandiyl diiodide;
sulfonaphthylhydrazone of hellebrigenin; chlorotetracycline;
N-(7-dimethylamino-4-methyl-2-oxo-3-chromenyl)maleimide;
N-(p-(2-benzimidazolyl)-phenyl)maleimide; N-(4-fluoranthyl)maleimide;
bis(homovanillic acid); resazurin; 4-chloro-7-nitro-2,1,3-benzooxadiazole;
merocyanine 540; resorufin; rose bengal; and
2,4-diphenyl-3(2H)-furanone. Many fluorescent tags are commercially
available from SIGMA chemical company (Saint Louis, Mo.), Molecular
Probes, R&D systems (Minneapolis, Minn.), Pharmacia LKB Biotechnology
(Piscataway, N.J.), CLONTECH Laboratories, Inc. (Palo Alto, Calif.), Chem
Genes Corp., Aldrich Chemical Company (Milwaukee, Wis.), Glen
Research, Inc., GIBCO BRL Life Technologies, Inc. (Gaithersburg, Md.),
Fluka Chemica-Biochemika Analytika (Fluka Chemie AG, Buchs,
Switzerland), and Applied Biosystems (Foster City, Calif.) as well as other
commercial sources known to one of skill in the art.

Chemiluminescent groups intended for use herein include any
components of light generating systems that are catalyzed by a
peroxidase and require superoxide anion (O2-) (and/or hydrogen peroxide
(H2O2)) (see, e.g., Musiani et al. (1998) Histol. Histopathol. 13(1):243-8).
Light-generating systems include, but are not limited to, luminol,
isoluminol, peroxyoxalate-fluorophore, acridinium ester, lucigenin,
dioxetanes, oxalate esters, acridan, hemin, indoxyl esters including
3-O-indoxyl esters, naphthalene derivatives, such as
7-dimethylamino-naphthalene-1,2-dicarbonic acid hydrazide and cypridina
luciferin analogs, including
2-methyl-6-[p-methoxyphenyl]-3,7-dihydroimidazo[1,2-a]pyrazin-3-one,
2-methyl-6-phenyl-3,7-dihydroimidazo[1,2-a]pyrazin-3-one and
2-methyl-6-[p-[2-[sodium 3-carboxylato-4-(6-hydroxy-3-xanthenon-9-yl)phenylthioureylene]ethyleneoxy]phenyl]-3,7-dihydroimidazo[1,2-a]pyrazin-3-one.
In other embodiments, the chemiluminescent moieties
intended for use herein include, but are not limited to, luminol, isoluminol,
N-(4-aminobutyl)-N-ethyl isoluminol (ABEI) and N-(4-aminobutyl)-N-methyl
isoluminol (ABMI), which have the following structures and participate in
the following reactions:

[Reaction scheme not recoverable from the source: structures bearing
substituents R and R1]

in which luminol is represented when R is NH2 and R1 is H; isoluminol,
when R is H and R1 is NH2; ABEI
(6-[N-(4-aminobutyl)-N-ethylamino]-2,3-dihydrophthalazine-1,4-dione),
when R is H and R1 is C2H5-N-(CH2)4NH2; and ABMI
(6-[N-(4-aminobutyl)-N-methylamino]-2,3-dihydrophthalazine-1,4-dione),
when R is H and R1 is CH3-N-(CH2)4NH2.

Bioluminescent groups for use herein include luciferase/luciferin
couples, including firefly (Photinus pyralis) luciferase, and the Aequorin
system (i.e., the purified jellyfish photoprotein, aequorin). Many
luciferases and substrates have been studied and well-characterized and
are commercially available (e.g., firefly luciferase is available from Sigma,
St. Louis, MO, and Boehringer Mannheim Biochemicals, Indianapolis, IN;
recombinantly produced firefly luciferase and other reagents based on this
gene or for use with this protein are available from Promega Corporation,
Madison, WI; the aequorin photoprotein luciferase from jellyfish and
luciferase from Renilla are commercially available from Sealite Sciences,
Bogart, GA; coelenterazine, the naturally-occurring substrate for these
luciferases, is available from Molecular Probes, Eugene, OR). Other
bioluminescent systems include crustacean, particularly Cypridina
(Vargula), systems; insect bioluminescence generating systems including
fireflies, click beetles, and other insect systems; bacterial systems;
dinoflagellate bioluminescence generating systems; systems from
molluscs, such as Latia and Pholas; earthworms and other annelids; glow
worms; marine polychaete worm systems; South American railway beetle;
fish (i.e., those found in species of Aristostomias, such as A. scintillans
(see, e.g., O'Day et al. (1974) Vision Res. 14:545-550), Pachystomias,
and Malacosteus, such as M. niger; blue/green emitters include
cyclothone, myctophids, hatchet fish (argyropelecus), vinciguerria, howella,
florenciella, and Chauliodus); and fluorescent proteins, including green
(i.e., GFPs, including those from Renilla and from Ptilosarcus), red and
blue (i.e., BFPs, including those from Vibrio fischeri, Vibrio harveyi or
Photobacterium phosphoreum) fluorescent proteins (including Renilla
mulleri luciferase, Gaussia species luciferase and Pleuromamma species
luciferase) and phycobiliproteins.

i. Cleavable divalent or multivalent Z moieties 
In one embodiment, Z is a cleavable divalent or multivalent moiety
and has the formula:

-(S1)t-M(R15)a-(S2)b-L-

in which S1 and S2 are spacer moieties; t and b are each independently 0
or 1; M is a central moiety possessing two or more points of attachment
(i.e., divalent or higher valency); in certain embodiments, two to six
points of attachment (i.e., divalent to hexavalent), in other embodiments,
2, 3, 4 or 5 points of attachment (i.e., divalent, trivalent, tetravalent or
pentavalent); R15 is a functional group that modifies the steric and/or
electronic (e.g., mesomeric, inductive effects) properties of M; a is 0 to 4,
in certain embodiments, 0, 1 or 2; and L is a bond that is cleavable prior
to or during analysis, including mass spectral analysis, of a biomolecule
without altering the chemical structure of the biomolecule, such as a
protein.

A spacer region S1 and/or S2 can be present on either or both sides
of the central moiety M of the compounds, thereby reducing steric
hindrance in reactions with the surface of large protein molecules (S2) or
facilitating hybridization with complementary sequences (S1). For
embodiments wherein the protein and the complementary sequence
possess low steric hindrance, a spacer may not be required. In certain
embodiments, steric hindrance can enhance selectivity. This enhanced
selectivity can be achieved either by the presence of one or more bulky
substituents R15 attached to M or by the selection of the
appropriate spacer molecules for S1 and/or S2.

In certain embodiments, S1 and S2 are each independently selected
from -(CH2)r-, -(CH2O)r-, -(CH2CH2-O)r-, -(NH-(CH2)r-C(=O))s-,
-(NH-CH(R52)-C(=O))s-, -(O-(CH)r-C(=O))s-,

[additional structures not recoverable from the source]

where R15 is as above; r and s are each independently an integer from 1
to 10; R52 is the side chain of a natural α-amino acid; and y is an integer
from 0 to 4. In one embodiment, y is 0 or 1.

In certain embodiments, R15 is -H, -OH, -OR51, -SH, -SR51, -NH2,
-NHR51, -N(R51)2, -F, -Cl, -Br, -I, -SO3H, -PO4^2-, -CH3, -CH2CH3,
-CH(CH3)2 or -C(CH3)3; where R51 is straight or branched chain alkyl,
straight or branched chain alkenyl, straight or branched chain alkynyl, aryl,
heteroaryl, cycloalkyl, heterocyclyl, straight or branched chain aralkyl,
straight or branched chain aralkenyl, straight or branched chain aralkynyl,
straight or branched chain heteroaralkyl, straight or branched chain
heteroaralkenyl, straight or branched chain heteroaralkynyl, straight or
branched chain cycloalkylalkyl, straight or branched chain
cycloalkylalkenyl, straight or branched chain cycloalkylalkynyl, straight or
branched chain heterocyclylalkyl, straight or branched chain
heterocyclylalkenyl or straight or branched chain heterocyclylalkynyl.

In certain embodiments, the cleavable group L is cleaved either
prior to or during analysis of the biomolecule, such as a protein. The
analysis may include mass spectral analysis, for example MALDI-TOF
mass spectral analysis. The cleavable group L is selected so that the
group is stable during conjugation to a biomolecule, hybridization of the Q
moiety to a complementary sequence, and washing of the hybrid; but is
susceptible to cleavage under conditions of analysis of the biomolecule,
including, but not limited to, mass spectral analysis, for example
MALDI-TOF mass spectral analysis. In certain embodiments, the cleavable
group L can be a disulfide moiety, created by reaction of the compounds
where X = -SH, with the thiol side chain of cysteine residues on the
surface of biomolecules, including, but not limited to, proteins. The
resulting disulfide bond can be cleaved under various reducing conditions
including, but not limited to, treatment with dithiothreitol and
2-mercaptoethanol.

In another embodiment, L is a photocleavable group, which can be
cleaved by a short treatment with UV light of the appropriate wavelength
either prior to or during mass spectrometry. Photocleavable groups,
including those bonds which can be cleaved during MALDI-TOF mass
spectrometry by the action of a laser beam, may be used. For example, a
trityl ether or an ortho nitro substituted aralkyl, including benzyl, group
is susceptible to laser induced bond cleavage during MALDI-TOF mass
spectrometry. Other useful photocleavable groups include, but are not
limited to, o-nitrobenzyl, phenacyl, and nitrophenylsulfenyl groups.

Other photocleavable groups for use herein include those disclosed
in International Patent Application Publication No. WO 98/20166. In one
embodiment, the photocleavable groups have formula I:

(I) [structure not recoverable from the source]

where R20 is ω-O-alkylene-; R21 is selected from hydrogen, alkyl, aryl,
alkoxycarbonyl, aryloxycarbonyl and carboxy; t is 0-3; and R50 is alkyl,
alkoxy, aryl or aryloxy. In one embodiment, Q is attached to R20 through
-(S1)t-M(R15)a-(S2)b- and the biomolecule of interest is captured onto the
R21CH-O- moiety via a reactive derivative of the oxygen (e.g., X).

In another embodiment, the photocleavable groups have formula II:

(II) [structure not recoverable from the source]

where R20 is ω-O-alkylene- or alkylene; R21 is selected from hydrogen,
alkyl, aryl, alkoxycarbonyl, aryloxycarbonyl and carboxy; and X20 is
hydrogen, alkyl or OR21. In one embodiment, Q is attached to R20 through
-(S1)t-M(R15)a-(S2)b-; and the biomolecule of interest is captured onto the
R21CH-O- moiety via a reactive derivative of the oxygen (e.g., X).

In further embodiments, R20 is -O-(CH2)3- or methylene; R21 is
selected from hydrogen, methyl and carboxy; and X20 is hydrogen, methyl
or OR21. In another embodiment, R21 is methyl; and X20 is hydrogen. In
certain embodiments, R20 is methylene; R21 is methyl; and X20 is
3-(4,4'-dimethoxytrityloxy)propoxy.

In another embodiment, the photocleavable groups have formula III:

[structure of formula III not recoverable from the source]

where R2 is selected from ω-O-alkylene-O- and ω-O-alkylene-, and is
unsubstituted or substituted on the alkylene chain with one or more alkyl
groups; c and e are each independently 0-4; and R70 and R71 are each
independently alkyl, alkoxy, aryl or aryloxy. In certain embodiments, R2 is
ω-O-alkylene-, and is substituted on the alkylene chain with a methyl
group. In one embodiment, Q is attached to R2 through
-(S1)t-M(R15)a-(S2)b-; and the biomolecule of interest is captured onto the
Ar2CH-O- moiety via a reactive derivative of the oxygen (e.g., X).

In further embodiments, R2 is selected from 3-O-(CH2)3-O-,
4-O-(CH2)4-, 3-O-(CH2)3-, 2-O-CH2CH2-, -OCH2-,

[additional structures, at least one bearing a methyl (Me) group, not
recoverable from the source]

In other embodiments, c and e are 0.

Other cleavable groups L include acid sensitive groups, in which
bond cleavage is promoted by formation of a cation upon exposure to
mild to strong acids. For these acid-labile groups, cleavage of the group L
can be effected either prior to or during analysis, including mass
spectrometric analysis, by the acidity of the matrix molecules, which are
deposited as a thin layer on top of the hybrids, or by applying a short
treatment of the array with an acid, such as the vapor of trifluoroacetic
acid. Exposure of a trityl group to acetic or trifluoroacetic acid produces
cleavage of the ether bond either before or during MALDI-TOF mass
spectrometry.

The compound-biomolecule array can be treated with either chemical
reagents, including, but not limited to, cyanogen bromide, or enzymatic
reagents, including, but not limited to, trypsin, chymotrypsin or an
exopeptidase (e.g., aminopeptidases and carboxypeptidases), to effect
cleavage. For the latter, all but one peptide fragment will remain
hybridized when digestion is quantitative. Partial digestion may also be of
advantage to identify and characterize proteins following desorption from
the array. The cleaved protein/peptide fragments are desorbed, analyzed,
and characterized by their respective molecular weights.
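
By way of non-limiting illustration, the characterization of cleaved
fragments by molecular weight can be sketched as an in-silico digest
followed by a mass calculation. The sketch below rests on assumptions
that are not part of the description above: it uses the common simplified
trypsin rule (cleavage C-terminal to K or R, except before P), computes
neutral monoisotopic masses of unmodified peptides from standard
residue masses, and uses hypothetical function names and a hypothetical
sequence.

```python
import re

# Standard monoisotopic residue masses in daltons (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # one water is added back per free peptide


def tryptic_digest(sequence: str) -> list:
    """Split after K or R unless the next residue is P (simplified trypsin rule)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence.upper())
    return [piece for piece in pieces if piece]  # drop the empty trailing piece


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER_MASS


def fragment_table(sequence: str) -> list:
    """Return (fragment, mass) pairs for a complete digest of the sequence."""
    return [(p, round(peptide_mass(p), 4)) for p in tryptic_digest(sequence)]


if __name__ == "__main__":
    # Hypothetical sequence chosen only to exercise the K/R and proline rules.
    for fragment, mass in fragment_table("MKWVTFRPLLKG"):
        print(fragment, mass)
```

A partial digest, as mentioned above, could be modeled by also listing
fragments that span one or more missed cleavage sites; the sketch covers
only the quantitative case.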

In certain embodiments herein, L is selected from -S-S-,

[structures not recoverable from the source]

-O-P(=O)(OR51)-NH-, -O-C(=O)-,

where R15, R51 and y are as defined above. In certain embodiments, R15 is
-H, -OH, -OR51, -SH, -SR51, -NH2, -NHR51, -N(R51)2, -F, -Cl, -Br, -I, -SO3H,
-PO4^2-, -CH3, -CH2CH3, -CH(CH3)2 or -C(CH3)3; where R51 is straight or
branched chain alkyl, straight or branched chain alkenyl, straight or
branched chain alkynyl, aryl, heteroaryl, cycloalkyl, heterocyclyl, straight
or branched chain aralkyl, straight or branched chain aralkenyl, straight or
branched chain aralkynyl, straight or branched chain heteroaralkyl,
straight or branched chain heteroaralkenyl, straight or branched chain
heteroaralkynyl, straight or branched chain cycloalkylalkyl, straight or
branched chain cycloalkylalkenyl, straight or branched chain
cycloalkylalkynyl, straight or branched chain heterocyclylalkyl, straight or
branched chain heterocyclylalkenyl or straight or branched chain
heterocyclylalkynyl.

ii. Non-cleavable divalent Z moieties 
In another embodiment, Z is a non-cleavable divalent moiety and
has the formula:

-(S1)t-M(R15)a-(S2)b-

in which S1, M, R15, S2, t, a and b are as defined above.

e. Z is an insoluble support or a substrate 
For the gradient arrays provided herein, Z is an insoluble support or
a substrate. In these embodiments, Z is linked to a plurality of X moieties
or a plurality of Y and X moieties or mixtures of X, Q and/or Y and other
moieties. Z, in certain embodiments, has tens up to hundreds,
thousands, millions, or more points of attachment. In these
embodiments, the insoluble support or substrate moiety Z is based on a
flat surface constructed, for example, of glass, silicon, metal, plastic or a
composite; or can be in the form of a bead such as a silica gel, a
controlled pore glass, a magnetic or cellulose bead; or can be a pin,
including an array of pins suitable for combinatorial synthesis or analysis.
Substrates can be fabricated from virtually any insoluble or solid material,
for example, silica gel, glass (e.g., controlled-pore glass (CPG)), nylon,
Wang resin, Merrifield resin, dextran cross-linked with epichlorohydrin
(e.g., Sephadex®), agarose (e.g., Sepharose®), cellulose, magnetic beads,
Dynabeads, a metal surface (e.g., steel, gold, silver, aluminum, silicon
and copper), or a plastic material (e.g., polyethylene, polypropylene,
polyamide, polyester, polyvinylidenedifluoride (PVDF)). Exemplary
substrates include, but are not limited to, beads (e.g., silica gel, controlled
pore glass, magnetic, dextran cross-linked with epichlorohydrin (e.g.,
Sephadex®), agarose (e.g., Sepharose®), cellulose), capillaries, flat
supports such as glass fiber filters, glass surfaces, metal surfaces (steel,
gold, silver, aluminum, copper and silicon), plastic materials including
multiwell plates or membranes (e.g., of polyethylene, polypropylene,
polyamide, polyvinylidenedifluoride), pins (e.g., arrays of pins suitable for
combinatorial synthesis or analysis), or beads in pits of flat surfaces such
as wafers (e.g., silicon wafers) with or without plates. The solid support
is in any desired form, including, but not limited to, a bead, capillary,
plate, membrane, wafer, comb, pin, a wafer with pits, an array of pits or
nanoliter wells and other geometries and forms known to those of skill in
the art. Supports include flat surfaces designed to receive or link samples
at discrete loci. In one embodiment, flat surfaces include those with
hydrophobic regions surrounding hydrophilic loci for receiving, containing
or binding a sample.

In one embodiment, the solid supports or substrates Z are beads,
including, but not limited to, polymeric, magnetic, colored, Rf-tagged, etc.
beads. The beads may be made from hydrophobic materials, including,
but not limited to, polystyrene, polyethylene, polypropylene or Teflon, or
hydrophilic materials, including, but not limited to, cellulose, dextran
cross-linked with epichlorohydrin (e.g., Sephadex®), agarose (e.g.,
Sepharose®), polyacrylamide, silica gel and controlled pore glass.

In further embodiments, the insoluble support or substrate Z
moieties may optionally possess spacer groups S1 and/or S2, or for
embodiments where Z is a cleavable linkage, L. The S1, S2 and/or L
moieties are attached to the surface of the insoluble support or substrate.
In these embodiments, the density of the biopolymer to be
analyzed, and thus signal intensity of the subsequent analysis, is
increased relative to embodiments where Z is a divalent or multivalent
group. In these embodiments, an appropriate array of single stranded
oligonucleotides or oligonucleotide analogs that are complementary to Q
will be employed in the methods provided herein.

In another embodiment, Z is a support that, when functionalized
with Q, X and/or Y, mimics the function of an artificial membrane by
capturing biomolecules, including but not limited to, proteins. In one
embodiment, Z is a support that, when functionalized with Q, X and/or Y,
mimics the specificity of the inside of a cell membrane. In this
embodiment, the support is able to capture proteins and other
biomolecules that the compounds provided herein would otherwise not be
able to capture, including proteins within cell membranes. In one
embodiment, the supports function under physiological conditions. Thus,
in the above embodiments, choice of Z, Q, X and/or Y allows for design
of surfaces and supports that mimic cell membranes and other biological
membranes. When the compounds provided herein act as an artificial
membrane, dendrimer polymer chemistry may be employed for controlled
synthesis of membranes having consistent pore dimensions and
membrane thicknesses, through synthesis of amphiphilic dendrimeric or
hyperbranched block copolymers that can be self-assembled to form
ultrathin organic film membranes on porous supports.

Gradient arrays

Provided herein are two dimensional gradient arrays. In these
embodiments, Z is a solid support and X, and optionally, Y moieties
(separately or linked to Z) are arrayed thereon such that the collection of
loci presents a gradient of selected properties. At each locus in the array,
X and Y moieties are presented, such as side-by-side or bound. Along
each row (i.e., X axis), each Y moiety is the same and a selected property
of X, such as hydrophobicity, lipophilicity, charge, size, specificity or
other property of each X, is altered in a predetermined manner along the
X-axis to provide a gradient of such property. Similarly, along each
column of loci in the array, a property of Y, such as hydrophobicity,
charge, size, specificity or other property of each Y, is altered in a
predetermined manner along the Y-axis to provide a gradient of such
property. For each array, the property of X moieties that is altered is not
the same as the property of the Y moieties that is altered. The resulting
array thereby presents a two dimensional gradient of two (or more)
selected properties and presents loci with different affinities. The number
of loci in the array can be any number, such as 10, 50, 100, 500, 1000,
10^4 and more or any predetermined number.
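
By way of non-limiting illustration, the layout of such a two dimensional
gradient array can be sketched as a grid in which one property varies
with the column index and a different property varies with the row index.
The sketch below is an assumption for explanatory purposes only: the
linear spacing of levels, the numeric scales and the names (Locus,
linear_gradient, build_gradient_array) are hypothetical and are not
prescribed by the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    row: int
    col: int
    x_value: float  # level of the X property (e.g., hydrophobicity); changes along a row
    y_value: float  # level of the Y property (e.g., charge); changes along a column


def linear_gradient(start: float, stop: float, steps: int) -> list:
    """Return `steps` evenly spaced levels from start to stop, inclusive."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if steps == 1:
        return [start]
    increment = (stop - start) / (steps - 1)
    return [start + i * increment for i in range(steps)]


def build_gradient_array(n_rows: int, n_cols: int, x_range: tuple, y_range: tuple) -> list:
    """Build a grid of loci: X property varies with column, Y property with row."""
    x_levels = linear_gradient(x_range[0], x_range[1], n_cols)
    y_levels = linear_gradient(y_range[0], y_range[1], n_rows)
    return [
        [Locus(r, c, x_levels[c], y_levels[r]) for c in range(n_cols)]
        for r in range(n_rows)
    ]


if __name__ == "__main__":
    # Hypothetical scales: X property from 0 to 1, Y property from -1 to +1.
    grid = build_gradient_array(3, 4, (0.0, 1.0), (-1.0, 1.0))
    for row in grid:
        print([(round(l.x_value, 2), round(l.y_value, 2)) for l in row])
```

Within any one row the Y level is constant while the X level changes, and
within any one column the reverse holds, so each locus carries a distinct
pair of levels.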

Such solid supports can be any solid support, such as those
provided herein, and include beads, flat squares and rectangles and other
geometries, and can be any size desired, such as 1 µm, 10 µm, 50 µm,
100 µm, 1 mm, 1 cm, 10 cm and larger, where the dimension refers to
the largest dimension, and for purposes herein is of a size convenient to
perform mass spectrometric analyses, particularly MALDI analyses. In
such arrays, the X and Y moieties can be provided such that they present
a gradient of selected properties.

In some embodiments, the Y moiety can be optional, and instead two
properties of X moieties are altered along the axes of the array. For
example, after arraying X1-Xn in a two dimensional array, where each X
moiety differs, for example in hydrophobicity, the arrayed molecules are
treated, such as by exposure to differing amounts of light or current on a
gradient along the Y axis, to alter properties, such as charge or further
alterations in hydrophobicity, of each X moiety along the Y axis.

In practice, the resulting arrays are analyzed as described herein.
For example, the resulting arrays are contacted with a sample to be
analyzed, such as a sample containing proteins, matrix is added and the
resulting array is analyzed by mass spectrometry. Exemplary gradient
arrays are set forth below.

Figure 5 shows a continuous support with gradients or lyotropic
monolayers where the molecules are increasingly hydrophobic/hydrophilic,
charged (negative or positive), or increasing/decreasing in size. Other
properties well known to those of skill in the art may also be used, such
as chemical properties, including reagent specificity for NH2, SH, SS,
OH, etc. groups. In one embodiment, the gradient along the X axis
differs from the Y axis, e.g., X = increasing hydrophobicity and Y =
increasing positive charge. In one embodiment, azobenzene switches
(see, e.g., Figure 6) are used wherein hydrophilicity is introduced as the
azobenzene group is exposed to light. A gradient of
hydrophobicity/hydrophilicity is formed as the light exposure increases.
Charge may be introduced using electrodes which generate a gradient of
electronic charges. See, e.g., Figure 7. Alternatively, along each
direction (X and Y), there will be many patches with different affinities.

The mixture of proteins is applied to the surface of the substrate,
whereby the proteins arrange according to their unique affinities. The
matrix is then spotted atop each area and analyzed by, e.g., MALDI-TOF.
f. Mass modified Z moieties



In further embodiments, including embodiments where Z includes a
cleavable moiety, Z can include a mass modifying tag. In certain
embodiments, the mass modifying tag is attached to the cleavable linker
L. In one embodiment, the mass modified Z moiety has the formula:

-(S1)t-M(R15)a-(S2)b-L-T-

in which S1, t, M, R15, a, S2, b and L are selected as above; and T is a
mass modifying tag. Mass modifying tags for use herein include, but are
not limited to, groups of formula -X1R10-, where X1 is a divalent group
such as -O-, -O-C(O)-(CH2)y-C(O)O-, -NH-C(O)-, -C(O)-NH-,
-NH-C(O)-(CH2)y-C(O)O-, -NH-C(S)-NH-, -O-P(O-alkyl)-O-, -O-SO2-O-,
-O-C(O)-CH2-S-, -S-, -NH- and

[chemical structure]

and R10 is a divalent group including -(CH2CH2O)z-CH2CH2O-,
-(CH2CH2O)z-CH2CH2O-alkylene, alkylene, alkenylene, alkynylene, arylene,
heteroarylene, -(CH2)z-CH2-O-, -(CH2)z-CH2-O-alkylene,
-(CH2CH2NH)z-CH2CH2NH-, -CH2-CH(OH)-CH2O-, -Si(R12)(R13)-, -CHF- and
-CF2-; where y is an integer from 1 to 20; z is an integer from 0 to 200;
R11 is the side chain of an α-amino acid; and R12 and R13 are each
independently selected from alkyl, aryl and aralkyl.

In other embodiments, -X1R10- is selected from -S-S-, -S-,
-(NH-(CH2)y-NH-C(O)-(CH2)y-C(O))z-NH-(CH2)y-NH-C(O)-(CH2)y-C(O)O-,
-(NH-(CH2)y-C(O))z-NH-(CH2)y-C(O)O-,
-(NH-CH(R11)-C(O))z-NH-CH(R11)-C(O)O-, and
-(O-(CH2)y-C(O))z-NH-(CH2)y-C(O)O-.

In the above embodiments, where R10 is an oligo-/polyethylene
glycol derivative, the mass-modifying increment is 44, i.e., five different
mass-modified species can be generated by changing z from 0 to 4, thus
adding mass units of 45 (z = 0), 89 (z = 1), 133 (z = 2), 177 (z = 3)
and 221 (z = 4) to the compounds. The oligo/polyethylene glycols can
also be monoalkylated by a lower alkyl such as methyl, ethyl, propyl,
isopropyl, t-butyl and the like.
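
The arithmetic behind the polyethylene glycol series can be checked with a short Python sketch. The function name `peg_tag_mass` is hypothetical; the base value of 45 and the increment of 44 are the nominal masses stated in the text.

```python
ETHYLENE_OXIDE_INCREMENT = 44  # nominal mass of one CH2CH2O repeat unit
BASE_MASS = 45                 # nominal mass added at z = 0, as stated in the text


def peg_tag_mass(z):
    """Nominal mass added by an oligoethylene glycol tag with z repeat units."""
    return BASE_MASS + ETHYLENE_OXIDE_INCREMENT * z


masses = [peg_tag_mass(z) for z in range(5)]
print(masses)  # [45, 89, 133, 177, 221]
```

Because the increment is constant, any two tags in the series differ by a multiple of 44 mass units, which is what allows the differently tagged species to be told apart in one spectrum.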
Other mass modifying tags include, but are not limited to, -CHF-,
-CF2-, -Si(CH3)2-, -Si(CH3)(C2H5)- and -Si(C2H5)2-. In other embodiments,
the mass modifying tags include homo- or heteropeptides. A non-limiting
example that generates mass-modified species with a mass increment of
57 is an oligoglycine, which produces mass modifications of, e.g., 74 (y =
1, z = 0), 131 (y = 1, z = 2), 188 (y = 1, z = 3) or 245 (y = 1, z =
4). Oligoamides also can be used, e.g., mass-modifications of 74 (y = 1,
z = 0), 88 (y = 2, z = 0), 102 (y = 3, z = 0), 116 (y = 4, z = 0),
etc., are obtainable. Those skilled in the art will appreciate that there are
numerous possibilities in addition to those exemplified herein for
introducing, in a predetermined manner, many different mass modifying
tags to the compounds provided herein.
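
The two series quoted above are simple arithmetic progressions, which the following Python sketch reproduces. The helper `mass_series` and the step index `k` are illustrative assumptions; `k` is not meant to be identified with the y or z subscripts used in the formulas.

```python
def mass_series(start, increment, count):
    """Return `count` nominal masses beginning at `start`, spaced by `increment`."""
    return [start + increment * k for k in range(count)]


# Oligoglycine: each added glycine residue contributes a nominal 57 mass units.
oligoglycine = mass_series(start=74, increment=57, count=4)

# Oligoamide with a growing alkylene chain: each added CH2 contributes 14.
oligoamide = mass_series(start=74, increment=14, count=4)

print(oligoglycine)  # [74, 131, 188, 245]
print(oligoamide)    # [74, 88, 102, 116]
```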

In other embodiments, R15 and/or S2 may be functionalized with
-X1R10H or -X1R10-alkyl, where X1 and R10 are defined as above, to serve
as mass modifying tags.

3. The moiety X 

In the compounds provided herein, X is a moiety that binds to or
interacts with the surface of a biomolecule, including, but not limited to,
the surface of a protein; an amino acid side chain of a protein; or an
active site of an enzyme (protein). Thus, in certain embodiments X is a
group which can react or interact with functionalities on the surface of a
protein to form covalent or non-covalent bonds. A wide selection of
different functional groups is available for X to interact with a protein.
For example, X can act either as a nucleophile or an electrophile to form
covalent bonds upon reaction with the amino acid residues on the surface
of a protein. Reagents that bind to amino acid side chains include, but
are not limited to, protecting groups for hydroxyl, carboxyl, amino, amide,
and thiol moieties, including those disclosed in T.W. Greene and P.G.M.
Wuts, "Protective Groups in Organic Synthesis," 3rd ed. (1999, Wiley
Interscience). These protecting groups can react with amino acid side
chains such as hydroxyl (serine, threonine, tyrosine); amino (lysine,
arginine, histidine, proline); amide (glutamine, asparagine); carboxylic
acid (aspartic acid, glutamic acid); and sulfur derivatives (cysteine,
methionine), and are readily adaptable for use in these compounds as the
reactive moiety X.

It is noteworthy that in addition to the wide range of group-specific
reagents which are known to persons of skill in the art, reagents which
are known in natural product chemistry can also serve as a basis for X in
forming covalent linkages.

In other embodiments, X is a protein purification dye, such as
acridine or methylene blue, that has a strong affinity for certain proteins.

Alternatively, X can act as an electron donor or an electron
acceptor to form non-covalent bonds or a complex, such as a
charge-transfer complex, with a biomolecule, including, but not limited to, a
protein. These reagents include those which interact strongly and with
high specificity with biomolecules, including, but not limited to, proteins,
without forming covalent bonds through the interaction of complementary
affinity surfaces. For example, well known complexes such as
biotin-streptavidin, antibody-antigen, receptor-ligand, lectin-carbohydrate and
other similar types of reagents, are readily adaptable for use in these
compounds as the reactive moiety X. Further, these complexes have
strong interactions which are stable enough to allow for a suitable
washing of the unbound biomolecules, including, but not limited to,
proteins, out of the complexed biological mixtures.
If S2 is not present, the reactivity of X can be influenced by one or
more substituted functionalities, for example, R15 on M. Electronic (e.g.,
mesomeric, inductive) and/or steric effects can be used to modulate the
reactivity of X and the stability of the resulting X-biomolecule linkage. In
these embodiments, subsets of biomolecular mixtures, including, but not
limited to, protein mixtures, can react and be analyzed due to the
modulation by R15, which changes the electronic or steric properties of X
and, therefore, increases the selectivity of the reaction of X with the
biomolecule.

In these embodiments, X is an active ester, such as
-C(=O)O-Ph-pNO2, -C(=O)O-C6F5 or -C(=O)-O-(N-succinimidyl); an active
halo moiety, such as an α-halo ether or an α-halo carbonyl group,
including, but not limited to, -OCH2-I, -OCH2-Br, -OCH2-Cl, -C(O)CH2I,
-C(O)CH2Br and -C(O)CH2Cl; amino acid side chain-specific functional
groups, such as maleimido (for cysteine), a metal complex, including gold
or mercury complexes (for cysteine or methionine), an epoxide or
isothiocyanate (for arginine or lysine); reagents which bind to active sites
of enzymes, including, but not limited to, transition state analogs; ligands
which bind to receptors, such as insulin; specific peptides which bind to
biomolecule surfaces, including glue peptides; lectins (e.g., mannose
type, lactose type); antibodies, e.g., against phosphorylated peptides;
antigens, such as a phage display library; haptens; biotin; avidin; or
streptavidin. Other embodiments of X are well known to those of skill in
the art and include those disclosed in Techniques in Protein Chemistry,
Vol. 1 (1989) T. Hugli ed. (Academic Press); Techniques in Protein
Chemistry, Vol. 5 (1994) J.W. Crabb ed. (Academic Press); Lundblad
Techniques in Protein Modification (1995) (CRC Press, Boca Raton, FL);
Glazer et al. (1976) Chemical Modification of Proteins (North Holland
(Amsterdam)) (American Elsevier, New York); and Hermanson (1996)
Bioconjugate Techniques (Academic Press, San Diego, CA).
4. Further embodiments 

In certain embodiments, the compounds provided herein have the
formula:

N1m-Bi-N2n-(S1)t-M(R15)a-(S2)b-L-X

in which N1, B, N2, S1, M, S2, L, X, m, i, n, t, a and b are as defined
above. In further embodiments, the compounds for use in the methods
provided herein include a mass modifying tag and have the formula:

N1m-Bi-N2n-(S1)t-M(R15)a-(S2)b-L-T-X

in which N1, B, N2, S1, M, S2, L, T, X, m, i, n, t, a and b are as defined
above.

In other embodiments, including those where Z is not a cleavable
linker, the compounds provided herein have the formula:

N1m-Bi-N2n-(S1)t-M(R15)a-(S2)b-X

in which N1, B, N2, S1, M, S2, X, m, i, n, t, a and b are as defined above.

D. Preparation of the Compounds

The preparation of the compounds is described below. Any 
compound or similar compound may be synthesized according to a 
method discussed in general below or by only minor modification of the 
methods by selecting appropriate starting materials. 

In general, the compounds may be prepared starting with the
central moiety Z. In certain embodiments, Z is -(S1)t-M(R15)a-(S2)b-L-. In
these embodiments, the compounds may be prepared starting with an
appropriately substituted (e.g., with one or more R15 groups) M group.
M(R15)a is optionally linked with S1 and/or S2, followed by linkage to the
cleavable linker L. Alternatively, the L group is optionally linked to S2,
followed by reaction with M(R15)a, and optionally S1. This Z group is then
derivatized on its S1 (or M(R15)a) terminus to have a functionality for
coupling with an oligonucleotide or oligonucleotide analog Q (e.g., a
phosphoramidite, H-phosphonate, or phosphoric triester group). The Q
group will generally be N-protected on the bases to avoid competing
reactions upon introduction of the X moiety. In one embodiment, the Z
group is reacted with a mixture of all possible permutations of an
oligonucleotide or oligonucleotide analog Q (e.g., 4^i permutations, where i
is the number of nucleotides or nucleotide analogs in B). The resulting Q-Z
compound or compounds is(are) then derivatized through the L terminus
to possess an X group for reaction with a biomolecule, such as a protein.
If desired, the N-protecting groups on the Q moiety are then removed.
Alternatively, the N-protecting groups may be removed following reaction
of the compound with a biomolecule, including a protein. In other
embodiments, Q can be synthesized on Z, including embodiments where Z
is an insoluble support or substrate, such as a bead. In a further embodiment,
Q is presynthesized by standard solid state techniques, then linked to M.
Alternatively, Q may be synthesized stepwise on the M moiety.
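
The size of the permuted B region grows as 4^i, which can be illustrated with a short Python sketch. The function name `enumerate_b_sequences` and the use of the four standard DNA bases as the alphabet are assumptions made for the example; nucleotide analogs would simply extend or replace the alphabet.

```python
from itertools import product


def enumerate_b_sequences(i, alphabet="ACGT"):
    """List every sequence of length i over the given base alphabet."""
    return ["".join(bases) for bases in product(alphabet, repeat=i)]


sequences = enumerate_b_sequences(3)
print(len(sequences))   # 64, i.e., 4**3
print(sequences[:4])    # ['AAA', 'AAC', 'AAG', 'AAT']
```

With i = 3 the mixture already contains 64 distinct Q moieties, so the length of B is a direct handle on how many addressable compounds are produced.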

Provided below are examples of syntheses of the compounds
provided herein containing alkaline-labile and photocleavable linkers. One
of ordinary skill in the art would be able to prepare other compounds
within the scope of this disclosure by routine modification of the methods
presented below, or by other methods well known to those of skill in the
art.

For synthesis of a compound provided herein containing an alkaline
labile linker, 1,4-di(hydroxymethyl)benzene (i.e., M) is mono-protected,
e.g., as the corresponding mono-tert-butyldimethylsilyl ether. The
remaining free alcohol is derivatized as the corresponding
2-cyanoethyl-N,N-diisopropylphosphoramidite by reaction with
2-cyanoethyl-N,N-diisopropylchlorophosphoramidite. Reaction of this amidite
with an oligonucleotide (i.e., Q) is followed by removal of the protecting group
to provide the corresponding alcohol. Reaction with, e.g., trichloromethyl
chloroformate affords the illustrated chloroformate (i.e., X).



[Reaction scheme: TBDMS-protected intermediates and their oligonucleotide conjugates]



For the synthesis of a compound provided herein containing a
photocleavable linker, 2-nitro-5-hydroxybenzaldehyde (i.e., a precursor of
L) is reacted with, e.g., 3-bromo-1-propanol to give the corresponding
ether-alcohol. The alcohol is then protected, e.g., as the corresponding
tert-butyldimethylsilyl ether. Reaction of this compound with
trimethylaluminum gives the corresponding benzyl alcohol, which is
derivatized as its phosphoramidite using the procedure described above.
The amidite is reacted with an oligonucleotide (i.e., Q), followed by
removal of the protecting group and derivatization of the resulting alcohol
as the corresponding chloroformate (i.e., X).

[Reaction scheme: photocleavable linker intermediates and their oligonucleotide conjugates]



For the synthesis of the compounds provided herein containing an
acid labile linker, e.g., a heterobifunctional trityl ether, the requisite
phosphoramidite trityl ether is reacted with the oligonucleotide or
oligonucleotide analog Q, followed by deprotection of the trityl ether and
capture of a biomolecule, e.g., a protein, on the alcohol via a reactive
derivative of the alcohol (X), as described above.

Those of skill in the art will appreciate that the above syntheses of
the compounds provided herein are exemplary only. Other syntheses of
the compounds provided herein can be envisioned. Also, one of skill in
the art will be able to modify the above syntheses in a routine manner to
synthesize other compounds within the scope of the instant disclosure.
E. Methods of Use of the Arrays 

1. General methods

The arrays provided herein may be used for the analysis,
quantification, purification and/or identification of the components of
biomolecule mixtures, including, but not limited to, protein mixtures. To
initiate the analytical process, these mixtures are pre-purified according to
standard procedures. In one embodiment, proteins are isolated from



WO 03/077851 



PCT/US03/07479



-78- 

biological fluids by cell lysis followed by either precipitation methods
(e.g., ammonium sulfate) or enzymatic degradation of the nucleic acids
and carbohydrates (if necessary) and the low molecular weight material is
removed by molecular sieving. Proteins can also be obtained from
expression libraries. Aliquots of the protein mixture are reacted with the
compounds provided herein, having different functionalities X, to
segregate the mixture into separate protein families according to the
selected reactivity of X. The diversity of B is selected depending on the
complexity of the mixture of proteins. Hence, there are sets of
compounds differing in X and B which can be selected for the analysis.
In certain embodiments, the analysis is conducted using the smallest
possible number of reactions necessary to completely analyze the
mixture. Thus, in these embodiments, selection of the diversity of B and
of the number of X groups of different reactivity will be a function of the
complexity of the biomolecular mixture to be analyzed. Minimization of
the diversity of B and the number of X groups allows for complete
analysis of the mixture with minimal complexity.

The separation of proteins from a complex mixture is achieved by
virtue of the compound-protein products being bound to an array of
complementary sequences. The supernatant, which contains the
compound-protein products, is contacted with and allowed to hybridize to
an array of complementary sequences. In one embodiment, a flat solid
support which carries, at spatially distinct locations, an array of
oligonucleotides or oligonucleotide analogs that is complementary to the
selected N1m-Bi-N2n oligonucleotide or oligonucleotide analog, is hybridized
to the compound-protein products.
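
How a tag sequence determines an array position can be sketched as a reverse-complement lookup. The following Python example is only a simplified model: it assumes plain DNA bases, perfect Watson-Crick pairing and hypothetical probe sequences, and the names `reverse_complement` and `build_address_map` are introduced here for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def build_address_map(probes):
    """Map each tag sequence to the index of the array position that captures it."""
    return {reverse_complement(probe): index for index, probe in enumerate(probes)}


# Hypothetical probes immobilized at array positions 0, 1 and 2.
probes = ["AACG", "TTGC", "GGAT"]
address = build_address_map(probes)
print(address["CGTT"])  # 0: a compound tagged CGTT hybridizes at position 0
```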

In embodiments where Z is an insoluble support or substrate, such 
as a bead, separation of the compound-protein products into an 
addressable array may be achieved by sorting into an array of microwell 




or microtiter plates, or other microcontainer arrays. In certain
embodiments, the microwell or microtiter plates, or microcontainers,
include single stranded oligonucleotides or oligonucleotide analogs that
are complementary to the oligonucleotide or oligonucleotide analog Q.

After reaction or complexation of the compounds with the proteins,
any excess compounds can be removed by adding a reagent designed to
act as a "capturing agent." For example, a biotinylated small molecule,
which has a functionality identical or similar to that which reacted with
the selected X, is allowed to react with any excess compound. Exposure
of this mixture to streptavidin bound to a magnetic bead allows for
removal of the excess of the compound.

Hybridization of the compound-protein products to a
complementary sequence is effected according to standard conditions
(e.g., in the presence of chaotropic salts to balance Tm values of the
various hybrids). Any non-hybridized material can be washed off and the
hybridized material analyzed.

In other embodiments, selective pooling of the products of different
X moiety-containing reagents (e.g., amino- and thiol-reactive X groups;
antibody and amino-reactive X groups; antibody and lectin X groups, etc.)
may be performed for combined analysis on a single assay (e.g., on a
single chip).

Figure 1 shows a method to separate and analyze a complex
mixture of proteins by use of MALDI-TOF mass spectrometry. Exposure
of a compound as described herein to a mixture of biomolecules,
including, but not limited to, proteins (P1 to P4), affords a
compound-protein array (NA = oligonucleotide moiety or oligonucleotide analog
moiety, L = cleavable linker, P = protein). Separation of the array is
effected by hybridization of the Q portion of the array to a complementary
sequence attached to a support, such as an oligonucleotide chip. The
proteins (P1 to P4) are then analyzed by MALDI-TOF mass spectrometry.

When the complexity of a mixture of biomolecules, including, but
not limited to, proteins, is low, affinity chromatographic or affinity
filtration methods can be applied to separate the compound-protein
products from the protein mixture. If the proteins to be analyzed were
fluorescently labeled prior to (or after) reaction with the compound but
prior to hybridization, these labeled proteins could also be detected on the
array. In this way the positions which carry a hybrid can be detected
prior to scanning over the array with MALDI-TOF mass spectrometry and
the time to analyze the array minimized. Mass spectrometers of various
kinds can be applied to analyze the proteins (e.g., linear or with reflection,
with or without delayed extraction, with TOF, Q-TOFs or Fourier
Transform analyzer with lasers of different wavelengths and xy sample
stages).
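
The time saving from a fluorescence pre-scan amounts to restricting the mass spectrometric scan to positions with signal. A minimal Python sketch follows, assuming a hypothetical grid of fluorescence readings and an arbitrary threshold; neither the values nor the function name `positions_to_scan` come from the disclosure.

```python
def positions_to_scan(fluorescence, threshold):
    """Return (row, column) positions whose fluorescence meets the threshold."""
    return [
        (r, c)
        for r, row in enumerate(fluorescence)
        for c, value in enumerate(row)
        if value >= threshold
    ]


# Hypothetical fluorescence readings for a 3 x 3 array (arbitrary units).
readings = [
    [0.1, 5.2, 0.0],
    [3.9, 0.2, 0.1],
    [0.0, 0.0, 7.5],
]
targets = positions_to_scan(readings, threshold=1.0)
print(targets)  # [(0, 1), (1, 0), (2, 2)]
```

In this toy case only three of nine positions would be interrogated by the laser, rather than the whole array.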

Mass spectrometer formats for use herein are matrix assisted laser
desorption ionization (MALDI), continuous or pulsed electrospray (ES)
ionization, ionspray, thermospray, or massive cluster impact mass
spectrometry and a detection format such as linear time-of-flight (TOF),
reflectron time-of-flight, single quadrupole, multiple quadrupole, single
magnetic sector, multiple magnetic sector, Fourier transform, ion
cyclotron resonance (ICR), ion trap, and combinations thereof such as
MALDI-TOF spectrometry. For example, for ES, the samples, dissolved
in water or in a volatile buffer, are injected either continuously or
discontinuously into an atmospheric pressure ionization interface (API)
and then mass analyzed by a quadrupole. The generation of multiple ion
peaks which can be obtained using ES mass spectrometry can increase
the accuracy of the mass determination. Even more detailed information
on the specific structure can be obtained using an MS/MS quadrupole
configuration.
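
Why multiple ion peaks improve mass accuracy can be seen from a simple charge-state calculation. The Python sketch below is an illustration under stated assumptions: positive-ion mode, protonated ions only, and peaks that belong to consecutive charge states of one species. The function names are hypothetical and the approach is a textbook simplification, not a procedure taken from the disclosure.

```python
PROTON_MASS = 1.007276  # Da


def charge_of_higher_mz_peak(mz_high, mz_low):
    """Charge of the higher-m/z peak of two adjacent charge states (z and z + 1)."""
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


def neutral_mass(mz, z):
    """Neutral mass of an ion observed at m/z carrying z protons."""
    return z * (mz - PROTON_MASS)


def average_mass_from_series(mz_peaks):
    """Average the neutral-mass estimates from a series of adjacent charge states."""
    peaks = sorted(mz_peaks, reverse=True)
    z0 = charge_of_higher_mz_peak(peaks[0], peaks[1])
    estimates = [neutral_mass(mz, z0 + k) for k, mz in enumerate(peaks)]
    return sum(estimates) / len(estimates)


# Synthetic peaks for a 10,000 Da species carrying 5, 6 and 7 protons.
synthetic_peaks = [10000.0 / z + PROTON_MASS for z in (5, 6, 7)]
estimated_mass = average_mass_from_series(synthetic_peaks)
print(round(estimated_mass, 3))
```

Each charge state gives an independent estimate of the same neutral mass, so averaging over the series reduces the influence of the measurement error in any single peak.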
Methods for performing MALDI are well known to those of skill in
the art. Numerous methods for improving resolution are also known.
For example, resolution in MALDI TOF mass spectrometry can be
improved by reducing the number of high energy collisions during ion
extraction (see, e.g., Juhasz et al. (1996) Analysis, Anal. Chem.
53:941-946, see also, e.g., U.S. Patent No. 5,777,325, U.S. Patent No.
5,742,049, U.S. Patent No. 5,654,545, U.S. Patent No. 5,641,959 and
U.S. Patent No. 5,760,393 for descriptions of MALDI and delayed
extraction protocols).

In MALDI mass spectrometry, various mass analyzers can be used,
e.g., magnetic sector/magnetic deflection instruments in single or triple
quadrupole mode (MS/MS), Fourier transform and time-of-flight (TOF),
including orthogonal time-of-flight (O-TOF), configurations as is known in
the art of mass spectrometry. For the desorption/ionization process,
numerous matrix/laser combinations can be used. Ion-trap and reflectron
configurations can also be employed.

MALDI-MS requires the biomolecule to be incorporated into a
matrix. It has been performed on polypeptides and on nucleic acids
mixed in a solid (i.e., crystalline) matrix. The matrix is selected so that it
absorbs the laser radiation. In these methods, a laser, such as a UV or
IR laser, is used to strike the biopolymer/matrix mixture, which is
crystallized on a probe tip or other suitable support, thereby effecting
desorption and ionization of the biopolymer. In addition, MALDI-MS has
been performed on polypeptides using glycerol and other liquids as a matrix.

A complex protein mixture can be selectively dissected and, in
taking all data together, completely analyzed through the use of
compounds with different functionalities X. The proteins present in a
mixture of biological origin can be detected because all proteins have
reactive functionalities present on their surfaces. If, at each position on
the compound-protein array, the same protein is present, either cleavable
under the same conditions as L or added without covalent attachment to the
solid support, and serves as an internal molecular weight standard, the relative
amount of each protein (or peptide if the protein array was enzymatically
digested) can be determined. This process allows for the detection of
changes in expressed proteins when comparing tissues from healthy and
diseased individuals, or when comparing the same tissue under different
physiological conditions (e.g., time dependent studies). The process also
allows for the detection of changes in expressed proteins when
comparing different sections of tissues (e.g., tumors), which may be
obtained, e.g., by laser biopsy.
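
Relative quantification against an internal standard reduces to a ratio of peak intensities. The Python sketch below assumes hypothetical intensity values and a linear detector response; the function name `relative_amounts` is introduced only for this example.

```python
def relative_amounts(peak_intensities, standard_intensity):
    """Express each protein peak as a ratio to the internal standard peak."""
    if standard_intensity <= 0:
        raise ValueError("internal standard intensity must be positive")
    return {
        name: intensity / standard_intensity
        for name, intensity in peak_intensities.items()
    }


# Hypothetical peak intensities from one array position (arbitrary units).
ratios = relative_amounts({"P1": 200.0, "P2": 50.0}, standard_intensity=100.0)
print(ratios)  # {'P1': 2.0, 'P2': 0.5}
```

Because the same standard is present at every position, ratios computed this way can be compared between positions and between samples.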

Protein-protein interactions and protein-small molecule (e.g., drug)
interactions can be studied by contacting the compound-protein array
with a mixture of the molecules of interest. In this case, a compound will
be used which has no cleavable linkage L, or which has a linkage L that is
stable under MALDI-TOF MS conditions. Subsequent scanning of the
array with the mass spectrometer demonstrates which hybridized proteins
of the protein array have effectively interacted with the protein or small
molecule mixtures of interest.

Analysis using the well known 2-hybrid methodology is also
possible and can be detected via mass spectrometry. See, e.g., U.S.
Patent Nos. 5,512,473, 5,580,721, 5,580,736, 5,955,280, 5,695,941.
See also, Brent et al. (1996) Nucleic Acids Res. 24(17):3341-3347.
In the above embodiments, including those where Z contains a
cleavable linkage, the compounds may contain a mass modifying tag. In
these embodiments, the mass modifying tag is used to analyze the
differences in structure (e.g., side chain modification such as
phosphorylation or dephosphorylation) and/or expression levels of
biopolymers, including proteins. In one embodiment, two compounds (or
two sets of compounds having identical permuted B moieties) are used
that only differ in the presence or absence of a mass modifying tag (or
have two mass tags with appropriate mass differences). One compound
(or one set of compounds) is (are) reacted with "healthy" tissue and the
mass modified compound(s) are reacted with the "disease" tissue under
otherwise identical conditions. The two reactions are pooled and
analyzed in a duplex mode. The mass differences will elucidate those
proteins that are altered structurally or expressed in different quantity in
the disease tissue. Three or more mass modifying tags can be used in
separate reactions and pooled for multiplex analysis to follow the
differences during different stages of disease development (i.e., mass
modifying tag 1 at time point 1, mass modifying tag 2 at time point 2,
etc.), or, alternatively, to analyze different tissue sections of a disease
tissue such as a tumor sample.
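
The duplex comparison can be pictured as pairing peaks whose masses differ by the mass of the tag and comparing their intensities. The following Python sketch uses made-up peak lists, a hypothetical tag mass of 44 and an arbitrary tolerance; `pair_duplex_peaks` is an illustrative name, not a procedure described in the disclosure.

```python
def pair_duplex_peaks(peaks, tag_mass, tolerance=0.5):
    """Pair untagged and tagged peaks and return (untagged mass, tagged/untagged ratio).

    `peaks` is a list of (mass, intensity) tuples from the pooled sample.
    """
    pairs = []
    for light_mass, light_intensity in peaks:
        for heavy_mass, heavy_intensity in peaks:
            shift = heavy_mass - light_mass
            if abs(shift - tag_mass) <= tolerance and light_intensity > 0:
                pairs.append((light_mass, heavy_intensity / light_intensity))
    return pairs


# Hypothetical pooled spectrum: "healthy" sample untagged, "disease" sample tagged.
spectrum = [(1000.0, 100.0), (1044.0, 200.0), (1500.0, 50.0), (1544.0, 50.0)]
pairs = pair_duplex_peaks(spectrum, tag_mass=44.0)
print(pairs)  # [(1000.0, 2.0), (1500.0, 1.0)]
```

In this toy spectrum the species at mass 1000 appears at twice the level in the tagged sample, while the species at mass 1500 is unchanged. A peak with no partner at the expected shift would point to a species present in only one sample or altered in structure.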

In further embodiments, selectivity in the reaction of the
compounds provided herein with a biopolymer mixture, including a protein
mixture, can also be achieved by performing the reactions under kinetic
control and by withdrawing aliquots at different time intervals.
Alternatively, different parallel reactions can be performed (all differing
in the B moiety of the Q group) and either performed with different
stoichiometric ratios or stopped at different time intervals and analyzed
separately.

In embodiments where the compounds provided herein possess a
luminescent or colorimetric group, the immobilized compound-biomolecule
conjugate may be viewed on the insoluble support prior to analysis.
Viewing the conjugate provides information about where the conjugate
has hybridized (such as for subsequent MALDI-TOF mass spectrometric
analysis). In certain embodiments, with selected reagents the quantity of
a given protein from separate experiments (e.g., healthy vs. disease, time






point 1 vs. time point 2, etc.) may be determined by using dyes which 
can be spectrophotometrically differentiated. 

In another embodiment, the methods are performed by tagging the
biopolymers to be analyzed, including but not limited to proteins, with
more than one, in one embodiment three to five, of the compounds
provided herein. Such compounds would possess functionality designed
to target smaller chemical features of the biomolecules rather than a
macromolecular feature. See, e.g., Figure 8. Such smaller chemical
features include, but are not limited to, NH2, SH, SS (after capping SH, SS
may be targeted by, e.g., gold), and OH. In one non-limiting example, the
phenolic OH of tyrosine is selectively captured using a diazo compound,
such as an aryldiazonium salt. In this embodiment, the reaction may be
performed in water. For example, a functionalized diazonium salt could
be used where the functionality allows for subsequent capture of a
compound provided herein, thereby providing an oligonucleotide-labelled
biomolecule. One such functionalized diazonium salt is:



A biomolecule modified with this reagent is then labelled with an
oligonucleotide possessing a diene residue. It is appreciated by those of
skill in the art that many reagent couples other than dienophile/diene may
be used in these embodiments. In the case of dienophile/diene, the
reaction of the dienophile with the diene may be performed in the
presence of many other functional groups, including
N-hydroxysuccinimido-activated oligonucleotides reacting with an NH2
group. Thus, these two specific labelling reactions may be performed
simultaneously (i.e., in one reaction mixture). See, e.g., Figure 11.

Subsequently, the multiply-tagged biomolecules are hybridized on
an array of antisense oligonucleotides, in one embodiment a chip
containing an array of antisense oligonucleotides. In general, the
specificity of hybridization increases using compounds provided herein
where Z is a dendrimer. See, e.g., Figure 9. In one embodiment, Z is a
dendritic structure containing up to about 6 branches. In this
embodiment, the methods provided herein allow for separation between
biomolecules labelled with, e.g., five oligotags, where four are similar and
one is different. See, e.g., Figure 10.

In embodiments where the compounds for use in the methods
provided herein are insoluble or poorly soluble in water or aqueous
buffers, organic solvents are added to the buffers to improve solubility.
In one embodiment, the ratio of buffer:organic solvent is such that
denaturation of the biomolecule does not occur. In another embodiment,
the organic solvents used include, but are not limited to, acetonitrile,
formamide and pyridine. In another embodiment, the ratio of
buffer:organic solvent is about 4:1. To determine if an organic co-solvent
is needed, the rate of reaction of the compounds provided herein with a
water-soluble amine, such as 5'-aminothymidine, is measured. For
example, the following reaction is performed in a variety of solvent
mixtures well known to those of skill in the art to determine optimal
conditions for subsequent biomolecular tagging and analysis:

2. Phenotype analyses

The arrays permit a top down holistic approach to analysis of the
proteome and other biomolecules. As noted, the arrays and methods of
use provide an unbiased way to analyze biomolecules, since the methods
do not necessarily assess specific classes of targets, but rather detect or
identify changes in the samples. The changes identified include structural
changes that are related to the primary sequences and modifications,
including post-translational modifications. In addition, since the capture
compounds can include a solubility function, they can be designed for
reaction in hydrophobic conditions, thereby permitting analysis of
membrane-bound and membrane-associated molecules, particularly
proteins.

Problems with proteome analysis arise from genetic variation that is
not related to a target phenotype, and from proteome variation due to
differences such as gender, age, metabolic state, the complex mixtures of
cells in target tissues and cell cycle stage. Thus, to identify or
detect changes, such as disease-related changes, among the biomolecule
components of tissues and cells, homogeneity of the sample can be
important. To provide homogeneity, cells with different phenotypes,
such as diseased versus healthy, from the same individual are compared.
As a result, differences in patterns of biomolecules can be attributed to
the differences in the phenotype rather than to differences among
individuals. Hence, samples can be obtained from a single individual and
cells with different phenotypes, such as healthy versus diseased and
responders versus non-responders, are separated. In addition, the cells
can be synchronized or frozen into a metabolic state to further reduce
background differences.

Thus, the arrays can be used to identify phenotype-specific
proteins or modifications thereof or other phenotype-specific biomolecules
and patterns thereof. This can be achieved by comparing biomolecule
samples from cells or tissues with one phenotype to biomolecule samples
from the equivalent cells or tissues with another phenotype.
Phenotypes in cells from the same individual and cell type are compared.
In particular, primary cells, primary cell culture and/or synchronized cells
are compared. The patterns of binding of biomolecules from the cells to
capture compound members of the collection can be identified and used
as a signature or profile of a disease or healthy state or other phenotypes.
The particular bound biomolecules, such as proteins, also can be
identified, and new disease-associated markers, such as particular proteins
or structures thereof, can be identified. Example 6 provides an exemplary
embodiment in which cells are separated. See also Figure 19.

Phenotypes for comparison include, but are not limited to:

1) samples from diseased versus healthy cells or tissues to identify
proteins or other biomolecules associated with disease or that are markers
for disease;

2) samples from drug responders and non-responders (e.g., only 20-30%
of malignant melanoma patients respond to alpha interferon and
others do not) to identify biomolecules indicative of response;

3) samples from cells or tissues with a toxicity profile to drugs or
environmental conditions to identify biomolecules associated with the
response or a marker of the response; and

4) samples from cells or tissues exposed to any condition or
exhibiting any phenotype in order to identify biomolecules, such as
proteins, associated with the response or phenotype or that are a marker
therefor.
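
The comparison of binding patterns between two phenotypes can be sketched in a few lines. The following is only an illustration, assuming each sample has already been reduced to a mapping from a capture compound identifier to the masses (in Da) detected at that array locus. The function name, the identifiers, the example masses and the tolerance are hypothetical and are not taken from the methods provided herein.

```python
def phenotype_specific_masses(sample_a, sample_b, tol=1.0):
    """Return, per capture compound, masses seen in sample_a but not in sample_b.

    sample_a, sample_b: dict mapping capture-compound id -> iterable of masses (Da).
    tol: hypothetical tolerance (Da) within which two masses count as the same species.
    """
    signature = {}
    for compound, masses_a in sample_a.items():
        masses_b = list(sample_b.get(compound, ()))
        unique = [m for m in masses_a
                  if not any(abs(m - other) <= tol for other in masses_b)]
        if unique:
            signature[compound] = sorted(unique)
    return signature


# Made-up data for two phenotypes taken from the same individual.
diseased = {"cc01": [12050.2, 23110.8], "cc02": [8560.4]}
healthy = {"cc01": [12050.6], "cc02": [8560.1]}
print(phenotype_specific_masses(diseased, healthy))
```

Swapping the two arguments gives the species present only in the second phenotype, so the same helper can be used in both directions.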

Generally the samples for each phenotype are obtained from the
same organism, such as from the same mammal, so that the cells are
essentially matched and any variation should reflect variation due to the
phenotype, not the source of the cells. Samples can be obtained from
primary cells (or tissues). In all instances, the samples can be obtained
from the same individual either before exposure or treatment or from
healthy non-diseased tissue in order to permit identification of
phenotype-associated biomolecules.

Cells can be separated by any suitable method that permits
identification of a particular phenotype and then separation of the cells
based thereon. Any separation method in which the live cells are
recovered, such as, for example, panning or negative panning (where
unwanted cells are captured and the wanted cells remain in the
supernatant), can be used. These methods include, but are not limited to:

1) flow cytometry;

2) specific capture;

3) negative panning in which unwanted cells are captured and the
targeted cells remain in the supernatant and live cells are recovered for
analysis; and

4) Laser Capture Microdissection (LCM) (Arcturus, Inc., Mountain
View, CA).

Thus sorting criteria include, but are not limited to, membrane
potential, ion flux, enzymatic activity, cell surface markers, disease
markers, and other such criteria that permit separation of cells from an
individual based on phenotype.

a) Exemplary separation methods 

1) Laser Capture Microdissection
Laser Capture Microdissection (LCM) (Arcturus, Inc., Mountain View,
CA) uses a microscope platform combined with a low-energy IR laser to
activate a plastic capture film onto selected cells of interest. The cells are
then gently lifted from the surrounding tissue. This approach precludes
any absorption of laser radiation by microdissected cells or surrounding
tissue, thus ensuring the integrity of RNA, DNA, and protein prepared
from the microdissected samples for downstream analysis.

2) Flow cytometry for separation

Flow cytometry is a method, somewhat analogous to fluorescence
microscopy, in which measurements are performed on particles (cells) in
liquid suspension, which flow one at a time through a focused laser beam
at rates up to several thousand particles per second. Light scattered and
fluorescence emitted by the particles (cells) is collected, filtered, digitized
and sent to a computer for analysis. Typically flow cytometry measures
the binding of a fluorochrome-labeled probe to cells and the comparison
of the resultant fluorescence to the background fluorescence of
unstained cells. Cells can be separated using a version of flow
cytometry, flow sorting, in which the particles (cells) are separated and
recovered from suspension based upon properties measured in flow. Cells
that are recovered via flow sorting are viable and can be collected under
sterile conditions. Typically, recovered subpopulations are in excess
of 99.5% pure (see Figures 19a and 19b).

Flow cytometry allows cells to be distinguished using various
parameters including physical and/or chemical characteristics associated
with cells or properties of cell-associated reagents or probes, any of
which are measured by instrument sensors.

Separation: Live v. Dead
Forward and side scatter are used for preliminary identification and
gating of cell populations. Scatter parameters are used to exclude debris,
dead cells, and unwanted aggregates. In a peripheral blood or bone
marrow sample, lymphocyte, monocyte and granulocyte populations can
be defined, and separately gated and analyzed, on the basis of forward
and side scatter. Cells that are recovered via flow sorting are viable and
can be collected under sterile conditions. Typically, recovered
subpopulations are in excess of 99.5% pure.
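
Gating on forward and side scatter amounts to keeping only those events whose two scatter values fall inside a chosen window. The following sketch shows a simple rectangular gate. The event records, the field names and the numeric gate limits are invented for illustration, and instruments typically allow gates of arbitrary shape.

```python
def scatter_gate(events, fsc_range, ssc_range):
    """Keep events whose forward (fsc) and side (ssc) scatter both lie inside the gate."""
    fsc_lo, fsc_hi = fsc_range
    ssc_lo, ssc_hi = ssc_range
    return [e for e in events
            if fsc_lo <= e["fsc"] <= fsc_hi and ssc_lo <= e["ssc"] <= ssc_hi]


# Hypothetical events in arbitrary scatter units.
events = [
    {"id": 1, "fsc": 40, "ssc": 15},    # very low scatter, e.g. debris
    {"id": 2, "fsc": 420, "ssc": 110},  # falls inside the gate
    {"id": 3, "fsc": 950, "ssc": 900},  # very high scatter, e.g. an aggregate
]
gated = scatter_gate(events, fsc_range=(300, 600), ssc_range=(50, 250))
print([e["id"] for e in gated])
```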

Common cell sorting experiments usually involve
immunofluorescence assays, i.e., staining of cells with antibodies
conjugated to fluorescent dyes in order to detect antigens. In addition,
sorting can be performed using GFP-reporter constructs in order to isolate
pure populations of cells expressing a given gene/construct.

a. Fluorescence 
Fluorescent parameter measurement permits investigation of cell
structures and functions based upon direct staining, reactions with
fluorochrome-labeled probes (e.g., antibodies), or expression of
fluorescent proteins. Fluorescence signals can be measured as single or
multiple parameters corresponding to different laser excitation and
fluorescence emission wavelengths. When different fluorochromes are
used simultaneously, signal spillover can occur between fluorescence
channels. This is corrected through compensation. Certain combinations
of fluorochromes cannot be used simultaneously; those of skill in the art
can identify such combinations.
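
Compensation can be illustrated for the simplest case of two channels. The sketch below assumes a linear spillover model in which a fixed fraction of each dye's signal appears in the other channel. The model, the function name and the coefficients are assumptions made for illustration, not values given herein.

```python
def compensate_two_channels(obs_a, obs_b, spill_a_into_b, spill_b_into_a):
    """Recover the two dye signals from two channels with linear spillover.

    Assumed model:
        obs_a = true_a + spill_b_into_a * true_b
        obs_b = spill_a_into_b * true_a + true_b
    """
    det = 1.0 - spill_a_into_b * spill_b_into_a
    if det == 0:
        raise ValueError("spillover coefficients make the system singular")
    true_a = (obs_a - spill_b_into_a * obs_b) / det
    true_b = (obs_b - spill_a_into_b * obs_a) / det
    return true_a, true_b


# Hypothetical case: 15% of dye A spills into channel B, 2% of dye B into channel A.
a, b = compensate_two_channels(1004.0, 350.0, spill_a_into_b=0.15, spill_b_into_a=0.02)
print(round(a, 3), round(b, 3))
```

With more than two fluorochromes the same idea becomes the solution of a small linear system, one equation per channel.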
b. Immunofluorescence

Immunofluorescence involves the staining of cells with antibodies
conjugated to fluorescent dyes such as FITC (fluorescein), PE
(phycoerythrin), APC (allophycocyanin), and PE-based tandem
conjugates (R670, Cy-Chrome and others). Cell surface antigens are the
usual targets of this assay, but antibodies can be directed at antigens or
cytokines in the cytoplasm as well.

DNA staining is used primarily for cell cycle profiling, or as one
method for measuring apoptosis. Propidium iodide (PI), the most
commonly used DNA stain, cannot enter live cells and can therefore be
used for viability assays. For cell cycle or apoptosis assays using PI, cells
must first be fixed in order for staining to take place (see protocol). The
relative quantity of PI-DNA staining corresponds to the proportion of cells
in G0/G1, S, and G2/M phases, with lesser amounts of staining indicating
apoptotic/necrotic cells. PI staining can be performed simultaneously with
certain fluorochromes, such as FITC and GFP, in assays to further
characterize apoptosis or gene expression.

Gene expression and transfection can be measured indirectly by
using a reporter gene in the construct. Green Fluorescent Protein-type
constructs (EGFP, red and blue fluorescent proteins) and β-galactosidase,
for example, can be used to quantify populations of those cells
expressing the gene/construct. Mutants of GFP are now available that
can be excited at common frequencies, but emit fluorescence at different
wavelengths. This allows for measurement of co-transfection, as well as
simultaneous detection of gene and antibody expression. Appropriate
negative (background) controls for experiments involving GFP-type
constructs should be included. Controls include, for example, the same
cell type, using the gene insert minus the GFP-type construct.

3) Metabolic Studies and other studies
Annexin-V can be labeled with various fluorochromes in order to identify
cells in early stages of apoptosis. CFSE binds to cell membranes and is
equally distributed when cells divide. The number of divisions cells
undergo in a period of time can then be counted. CFSE can be used in
conjunction with certain fluorochromes for immunofluorescence. Calcium
flux can be measured using Indo-1 markers. This can be combined with
immunofluorescent staining. Intercellular conjugation assays can be
performed using combinations of dyes such as calcein or hydroethidine.
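
Because CFSE is distributed equally between daughter cells, each division halves the signal per cell, so the number of divisions follows from the base-2 logarithm of the intensity ratio. The sketch below assumes ideal halving with no dye loss or background; the function name and intensities are hypothetical.

```python
import math


def divisions_from_cfse(undivided_intensity, cell_intensity):
    """Estimate the number of divisions from CFSE fluorescence.

    Assumes each division exactly halves the per-cell signal, so
    divisions = log2(undivided_intensity / cell_intensity), rounded to an integer.
    """
    if undivided_intensity <= 0 or cell_intensity <= 0:
        raise ValueError("intensities must be positive")
    return max(0, round(math.log2(undivided_intensity / cell_intensity)))


print(divisions_from_cfse(8000.0, 1000.0))
```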

b) Synchronizing cell cycles
Once sorted or separated cells are obtained, they can be cultured
and can be synchronized or frozen into a particular metabolic state. This
enhances the ability to identify phenotype-specific biomolecules. Such
cells can be separated by the above methods, including by flow
cytometry. Further, cells in the same cell cycle, same metabolic state or
other synchronized state can be separated into groups using flow
cytometry (see Figure 19c).

Cell cycles can be synchronized or frozen by a variety of methods,
including, but not limited to, chelation of critical ions, such as by
removal of magnesium, zinc, manganese, cobalt and/or other ions that
perform specific functions, by EDTA or other chelators (see, e.g.,
EXAMPLES). Other methods include controlling various metabolic or
biochemical pathways. Figure 18 depicts exemplary points of regulation
of metabolic control mechanisms for cell synchronization. Examples of
metabolic control for synchronizing or "freezing" cells
include, but are not limited to, the following:

1) control of gene expression;

2) regulation of enzyme reactions;

3) negative control: feedback inhibition or end product repression
and enzyme induction are mechanisms of negative control that lead to a
decrease in the transcription of proteins;

4) positive control: catabolite repression is considered a form of
positive control because it effects an increase in transcription of proteins;

5) control of individual protein translation:

a) oligonucleotides that hybridize to the 5' cap site
inhibit protein synthesis by inhibiting the initial interaction between the
mRNA and the ribosome 40S subunit;

b) oligonucleotides that hybridize to the 5' UTR up to, and
including, the translation initiation codon inhibit the scanning of the 40S
(or 30S) subunit or assembly of the full ribosome (80S for eukaryotes or
70S for bacterial systems);

6) control of post-translational modification; and

7) control of allosteric enzymes, where the active site binds to the
substrate of the enzyme and converts it to a product. The allosteric site
is occupied by some small molecule that is not a substrate. If the protein
is an enzyme, when the allosteric site is occupied, the enzyme is inactive,
i.e., the effector molecule decreases the activity of the enzyme. Some
multicomponent allosteric enzymes have several sites occupied by various
effector molecules that modulate enzyme activity over a range of
conditions.

3. Analysis of low abundancy proteins

Important disease-associated markers and targets could be low
abundancy proteins that might not be detected by mass spectrometry.
To ensure detection, a first capture compound display experiment can be
performed. The resulting array of captured proteins is reacted with a
non-selective dye, such as a fluorescent dye, that will light up or render visible
more proteins on the array. The dye can provide a semi-quantitative
estimate of the amount of a protein. The number of different proteins
detected by the dye can be determined and then compared to the number
detected by mass spectrometric analysis. If there are more proteins
detected using the dye, the experiments can be repeated using a higher
starting number of cells so that low abundance proteins can be detected
and identified by the mass spectrometric analysis.

For example, housekeeping proteins, such as actin and other such
proteins, are present in high abundance and can mask low abundancy
proteins. Capture compounds or other purification compounds can be
selected or designed to capture or remove the high abundancy proteins or
biomolecules from a mixture before using a collection to assess the
components of the mixture. Once the high abundancy proteins are
removed, low abundancy proteins have an effectively higher concentration
and can be detected. These methods thus have two steps: a first step
to capture high abundancy components of biomolecule mixtures, such as
the actins, and a second step to assess the remaining components. For
example, a cell lysate can be contacted with capture
molecules that include a reactivity group such as biotin or other general
reactivity function linked to a sorting group to remove such high
abundancy proteins, and then a suitable collection of capture
compounds is used to identify lower abundancy compounds remaining in the
lysate.

Also, as discussed above, capture compounds can be
designed, such as by appropriate selection of W, to interact with
intact organelles before disrupting them in cells that have been gently
lysed or otherwise treated to permit access to organelles and internal
membranes. Then the captured organelles can be disrupted, such as on
a support, which can include an artificial membrane, such as a lipid
bilayer or micelle
coating, to capture the organelle proteins and other biomolecules in an
environment that retains their three-dimensional structure. These
captured proteins can be analyzed. This permits the capture compounds
to interact with the captured proteins and other biomolecules in their
native tertiary structure.

4. Monitoring protein conformation as an indicator of disease
The arrays and/or loci thereof can be used to detect or distinguish
specific conformers of proteins. Hence, for example, if a particular
conformation of a protein is associated with a disease (or healthy state),
the arrays or member loci thereof can detect one conformer or distinguish
conformers based upon a pattern of binding to the capture compounds in a
collection. Thus, the arrays and/or members thereof can be used to
detect conformationally altered protein diseases (or diseases of protein
aggregation), where a disease-associated protein or polypeptide has a
disease-associated conformation. The methods and arrays provided
herein permit detection of a conformer associated with a disease.
These diseases include, but are not limited to, amyloid diseases
and neurodegenerative diseases. Other diseases and associated proteins
that exhibit two or more different conformations in which at least one
conformation is associated with disease include those set forth in the
following Table:



Disease | Insoluble protein
Alzheimer's Disease (AD) | APP, Aβ, α1-antichymotrypsin, tau, non-Aβ component, presenilin 1, presenilin 2, apoE
Prion diseases, including but not limited to, Creutzfeldt-Jakob disease, scrapie, bovine spongiform encephalopathy | PrPSc
Amyotrophic lateral sclerosis (ALS) | superoxide dismutase (SOD) and neurofilament
Pick's Disease | Pick body
Parkinson's disease | α-synuclein in Lewy bodies
Frontotemporal dementia | tau in fibrils
Diabetes Type II | amylin
Multiple myeloma | IgG L-chain
Plasma cell dyscrasias |
Familial amyloidotic polyneuropathy | Transthyretin
Medullary carcinoma of thyroid | Procalcitonin
Chronic renal failure | β2-microglobulin
Congestive heart failure | Atrial natriuretic factor
Senile cardiac and systemic amyloidosis | transthyretin
Chronic inflammation | Serum Amyloid A
Atherosclerosis | ApoA1
Familial amyloidosis | Gelsolin
Huntington's disease | Huntingtin



The arrays can be contacted with a mixture of the conformers and
the members that bind or retain each form can be identified, and a pattern
thus associated with each conformer. Alternatively, those that bind to
only one conformer, such as the conformer associated with disease, can
be identified, and sub-collections of one or more of such arrays can be
used as a diagnostic reagent for the disease.
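
Selecting the members that bind only one conformer is a simple filtering step once binding has been recorded per array member. The sketch below assumes the binding results are available as a mapping from a capture compound identifier to the set of conformers it retains; the identifiers and conformer labels are invented for illustration.

```python
def conformer_specific(binding):
    """Group capture compounds that retain exactly one conformer.

    binding: dict mapping capture-compound id -> set of conformer labels retained.
    Returns a dict mapping conformer label -> sorted list of compounds specific to it.
    """
    specific = {}
    for compound, conformers in binding.items():
        if len(conformers) == 1:
            (label,) = conformers
            specific.setdefault(label, []).append(compound)
    return {label: sorted(ids) for label, ids in specific.items()}


# Hypothetical binding results for five array members.
binding = {
    "cc01": {"native", "disease"},  # binds both forms, not diagnostic
    "cc02": {"disease"},
    "cc03": {"native"},
    "cc04": {"disease"},
    "cc05": set(),                  # binds neither form
}
print(conformer_specific(binding))
```

The compounds listed under the disease-associated label correspond to a candidate sub-collection for use as a diagnostic reagent.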

5. Small molecule identification and biomolecule-biomolecule
interaction investigation

Biomolecules, such as proteins, are sorted using a covalent or
noncovalent interaction with immobilized capture compounds. Arrays of
capture compounds bound to biomolecules, such as from cell lysates, then
can be used to
screen libraries or other mixtures of drug candidates or to further screen
mixtures of biomolecules to see what binds to the bound biomolecules.
The captured biomolecule-biomolecule complexes or biomolecule-drug
candidate complexes can be analyzed to identify biochemical pathways
and also to identify targets of the candidate drug.

For example, protein-protein or protein-biomolecule interactions are
exposed to test compounds, typically small molecules, including small
organic molecules, peptides, peptide mimetics, antisense molecules or
dsRNA, antibodies, fragments of antibodies, recombinant and synthetic
antibodies and fragments thereof and other such compounds that can
serve as drug candidates or lead compounds. Bound small molecules are
identified by mass spectrometry or other analytical methods.
F. Systems 

In further embodiments, the compounds and the methods described
herein are designed to be placed into an integrated system that
standardizes and automates the following process steps:

• Isolation of biomolecules from a biological source, including
isolation of the proteins from cell lysates (lysis, enzymatic
digestion, precipitation, washing)

• Optionally, removal of low molecular weight materials

• Optionally, aliquoting the biomolecule mixture, such as a
protein mixture

• Reaction of the biomolecule mixture, such as a protein
mixture, with compounds of different chemical reactivity (X)
and sequence diversity (B) provided herein; this step can be
performed in parallel using aliquots of the biomolecule
mixture

• Optionally, removal of excess compound

• Hybridization of the compound-biomolecule conjugate, such
as a compound-protein conjugate, to single stranded
oligonucleotides or oligonucleotide analogs that are
complementary to the Q moiety of the compound; the single
stranded oligonucleotides or oligonucleotide analogs are
optionally presented in an array format and are optionally
immobilized on an insoluble support

• Optionally, subsequent chemical or enzymatic treatment of
the protein array

• Analysis of the biomolecule array, including, but not limited
to, the steps of (i) deposition of matrix, and (ii) spot-by-spot
MALDI-TOF mass spectrometry using an array mass
spectrometer (with or without internal, e.g., on-chip
molecular weight standard for calibration and quantitation).
The system includes the collections provided herein, optionally
arrays of such collections, software for control of the processes of
sample preparation and instrumental analysis and for analysis of the
resulting data, and instrumentation, such as a mass spectrometer, for
analysis of the biomolecules. The system can include other devices, such
as liquid chromatographic devices, so that a protein mixture is at least
partially separated. The eluent is collected in a continuous series of
aliquots into, e.g., microtiter plates, and each aliquot is reacted with a
capture compound provided herein.

In multiplex reactions, aliquots in each well can simultaneously
react with one or more of the capture compounds provided herein that,
for example, each differ in X (e.g., amino, thiol, lectin specific
functionality), with each having a specific and differentiating selectivity
moiety Y, and in the Q group. Chromatography can be done in aqueous or
in organic medium. The resulting reaction mixtures are pooled and analyzed
directly. Alternatively, subsequent secondary reactions or molecular
interaction studies are performed prior to analysis, including mass
spectrometric analysis.

The systems provided herein can contain an assembly line in which, for
example, pipetting robots on xy stages and reagent supply/washing modules
are linked with a central separation device and a terminal mass spectrometer
for analysis and data interpretation. The systems can be programmed to
perform process steps including (see, e.g., FIG. 2), for example:

1) Cell cultures (or tissue samples) are provided in microtiter
plates (MTPs) with 1, 2...i wells. To each well, solutions are
added for lysis of cells, thereby liberating the proteins. In
some embodiments, appropriate washing steps are included,
as well as addition of enzymes to digest nucleic acids and
other non-protein components. In further embodiments,
instead of regular MTPs, MTPs with filter plates in the
bottom of wells are used. Cell debris is removed either by
filtration or centrifugation. A conditioning solution for the
appropriate separation process is added and the material
from each well is separately loaded onto the separation device.

2) Separation utilizes different separation principles such as
charge, molecular sizing, adsorption, ion-exchange, and
molecular exclusion principles. Depending on the sample
size, appropriate dimensions are utilized, such as
microbore high performance liquid chromatography (HPLC).
In certain embodiments, a continuous flow process is used
and the effluent is continuously aliquotted into MTP 1, 2...n.

3) Reaction with Proteome Reagents. Each MTP in turn is
transferred to a Proteome Reagent Station harboring 1, 2...
m reagents differing only in the oligonucleotide sequence
part (i.e., Q) or/and in the chemical nature of the
functionality reacting with the proteins (i.e., X). If there is
more than one MTP coming from one tissue sample, then
reagent 1 is added to the same well of the respective MTPs
1, 2...n, i.e., in well A1, reagent 2 in well A2, etc. In
embodiments where the MTPs have 96 wells (i = 1-96), 96
different Proteome Reagents (i.e., 96 different compounds
provided herein, m = 1-96) are supplied through 96 different
nozzles from the Proteome Reagent Station to prevent
cross-contamination.

4) Pooling: Excess Proteome Reagent is deactivated, aliquots
from each well belonging to one and the same tissue
sample are pooled, and the remaining material is stored at
conditions that preserve the structure (and if necessary
conformation) of the proteins intact, thereby serving as
master MTPs for subsequent experiments.

5) Excess Proteome Reagent is removed in the pooled sample
using, e.g., the biotin/streptavidin system with magnetic
beads, then the supernatant is concentrated and conditioned
for hybridization.

6) Transfer to an Oligonucleotide Chip. After a washing step to
remove non-hybridized and other low molecular weight
material, a matrix is added. Alternatively, before matrix
addition, a digestion with, e.g., trypsin or/and chymotrypsin
is performed. After washing out the enzyme and the
digestion products, the matrix is added.

7) Transfer of chip to mass spectrometer. In one embodiment,
MALDI-TOF mass spectrometry is performed. Other mass
spectrometric configurations suitable for protein analysis also
can be applied. The mass spectrometer has an xy stage and
thereby rasters over each position on the spot for analysis.
The Proteome Reagent can be designed so that most of the
reagent part (including the part hybridizing with the
oligonucleotide chip array) is cleaved either before or during
mass spectrometry and therefore will be detected in the low
molecular weight area of the spectrum and will be
well separated from the peptide (in case of enzymatic
digestion) or protein molecular weight signals in the mass
spectrum.

8) Finally, the molecular weight signals can be processed for
noise reduction, background subtraction and other such
processing steps. The data obtained can be archived and
interpreted. The molecular weight values of the proteins (or
the peptides obtained after enzymatic digestion) are
associated with the human DNA sequence information and
the derived protein sequence information from the protein
coding regions. An interaction with available databases will
reveal whether the proteins and their functions are already
known. If the function is unknown, the protein can be
expressed from the known DNA sequence in sufficient scale
using standard methods to elucidate its function and
subsequent location in a biochemical pathway, where it plays
its metabolic role in a healthy individual or in the disease
pathway for an individual with disease.
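
The assignment of Proteome Reagents to wells described in step 3 (reagent 1 in well A1, reagent 2 in well A2, and so on) can be sketched as a small mapping. The sketch assumes a conventional 96-well layout of 8 rows (A to H) by 12 columns filled row by row; the layout, the fill order and the function name are assumptions, since the text gives only the first two assignments.

```python
import string


def well_for_reagent(m, n_rows=8, n_cols=12):
    """Map reagent number m (1-based) to a well name, filling the plate row by row."""
    if not 1 <= m <= n_rows * n_cols:
        raise ValueError("reagent number does not fit on the plate")
    row, col = divmod(m - 1, n_cols)
    return f"{string.ascii_uppercase[row]}{col + 1}"


# The same mapping is applied to every MTP 1, 2...n from one tissue sample.
print([well_for_reagent(m) for m in (1, 2, 13, 96)])
```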
Since the master plates containing aliquots from the different
proteins within a given tissue sample have been stored and are available,
subsequent experiments then can be performed in a now preselected way,
e.g., the proteins are displayed on the chip surface for protein-protein
(biomolecule) interaction studies for target validation or/and to study the
interaction with combinatorial libraries of small molecules for drug
candidate selection.

G. Bioinformatics

The raw data generated from the mass spectrometry analysis of the
compound-protein species is processed by background subtraction, noise
reduction, molecular weight calibration and peak refinement (e.g., peak
integration). The molecular weight values of the cleaved proteins or the
digestion products are interpreted and compared with existing protein
databases to determine whether the protein in question is known, and if
so, what modifications are present (glycosylated or not glycosylated,
phosphorylated or not phosphorylated, etc.). The different sets of
experiments belonging to one set of compounds are composed, compared
and interpreted. For example, one set of experiments uses a set of
compounds with one X moiety and different Q moieties. This set of
experiments provides data for a portion of the proteome, since not all
proteins in the proteome will react with a given X moiety. Superposition
of the data from this set of experiments with data from other sets of
experiments with different X moieties provides data for the complete
proteome.
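
The comparison of measured molecular weights with database entries, including a check for modifications such as phosphorylation, can be sketched as follows. The sketch assumes an in-memory table of protein masses and considers only phosphorylation, using the mass of about 79.966 Da added per phosphate group; the protein names, masses, tolerance and limit on the number of phosphate groups are hypothetical.

```python
PHOSPHO_DA = 79.966  # approximate mass added by one phosphate group (HPO3), in Da


def annotate_mass(observed, database, tol=0.5, max_phospho=3):
    """Match an observed mass to database proteins carrying 0..max_phospho phosphates.

    database: dict mapping protein name -> unmodified mass (Da).
    Returns a list of (protein name, number of phosphate groups) within tol Da.
    """
    hits = []
    for name, mass in database.items():
        for k in range(max_phospho + 1):
            if abs(observed - (mass + k * PHOSPHO_DA)) <= tol:
                hits.append((name, k))
    return hits


# Made-up database and measurement.
database = {"protein_A": 15000.0, "protein_B": 22340.5}
print(annotate_mass(15159.9, database))
```

An empty result indicates a mass not explained by the table, that is, a candidate unknown protein or a modification not modelled in this sketch.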

Sets of experiments comparing tissues of healthy and diseased
individuals or from different physiological or developmental stages (e.g.,
tumor progression, dependence on drug treatments to monitor results of
therapy, immune response to virus or bacteria infection) or different
tissue areas (e.g., of a tumor) are investigated, and the final data
archived.

The following examples are included for illustrative purposes only
and are not intended to limit the scope of the invention.

EXAMPLE 1

Examples for N^1_m-B_i-N^2_n

a. N^1 and N^2 as identical tetramers, B as a trimer

N^1 = N^2, m = n = 4, i = 3, B = 64 sequence permutations

GTGC ATG GTGC
     AAG
     ACG
     AGG
     TTG
     CTG
     GTG
     ...
     GGG

b. N^1 and N^2 as non-identical tetramers, B as a tetramer

N^1 ≠ N^2, m = n = 4, i = 4, B = 256 sequence permutations

GTCC ATCG CTAC
     AACG
     ACCG
     AGCG
     ...
     GGGG

c. N^1 as a heptamer, N^2 as an octamer, B as an octamer

N^1 ≠ N^2, m = 7, n = 8, i = 8, B = 65,536 sequence permutations.

GCTGCCC ATTCGTAC GCCTGCCC
N^1     B        N^2

EXAMPLE 2

Separation of proteins on a DNA array

N^1_m-B_i-N^2_n-(S^1)_t-M(R^15)_a-(S^2)_b-L-X-Protein where B is a trimer;
m = n = 4, i = 3, t = b = 1; underlined sequences are N^1 and N^2

GTGC ATG GTGC - S^1 - M(R^15)_a - S^2 - L - X - Protein 1
CACG TAC CACG

GTGC AAG GTGC - S^1 - M(R^15)_a - S^2 - L - X - Protein 2
CACG TTC CACG

GTGC ACG GTGC - S^1 - M(R^15)_a - S^2 - L - X - Protein 3
CACG TGC CACG

...

GTGC GGG GTGC - S^1 - M(R^15)_a - S^2 - L - X - Protein 64
CACG CCC CACG
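
The number of B sequences in Example 1 is 4 to the power i (64 for a trimer, 256 for a tetramer, 65,536 for an octamer), and each array oligonucleotide in Example 2 is the base-by-base complement of its tag. The following sketch enumerates the tags and derives the complementary strand; the function names are illustrative, and the complement is written in the same left-to-right alignment used for the strand pairs of Example 2.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def tags(n1, n2, i):
    """Yield every tag N1-B-N2, where B runs over all 4**i sequences of length i."""
    for bases in product("ATCG", repeat=i):
        yield n1 + "".join(bases) + n2


def complement(seq):
    """Base-by-base complement, aligned position by position with the input."""
    return "".join(COMPLEMENT[base] for base in seq)


all_tags = list(tags("GTGC", "GTGC", 3))
print(len(all_tags))               # number of trimer permutations of B
print(complement("GTGCATGGTGC"))   # strand pairing with GTGC ATG GTGC
```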



EXAMPLE 3 

Synthesis of Glass-based Arrays:

Q-Z-X
  |
  Y

Glass slides were silanized according to standard protocols using
trimethoxyaminopropylsilane. The surface loading was boosted by first
activating the slides with phenylenediisothiocyanate followed by
treatment with 4th generation PAMAM (polyamidoamine) dendrimer (64
amino groups) (see, e.g., http://www.dendritech.com/pamam.html). The
slides were then coupled to an 18-atom amphiphilic linker using
phosphoramidite chemistry. The derivatized slides were then subjected to
a masking synthesis method to produce distinct patches using trebler
phosphoramidites (see Figure 4). Alternatively, the phosphoramidites are
applied to create a continuous gradient in each dimension of the glass
slide. Features that may be varied using this masking or gradient
synthesis include, but are not limited to, hydrophobicity/hydrophilicity and
positive/negative charge.

EXAMPLE 4 

I. Preparation of protein mixtures from cells or via protein translation
of a cDNA library prepared from cells or tissues

The protein mixtures can be selectively divided by physical or
biochemical separation techniques.

1. Preparation of limited complexity protein pools using cell
culture or tissue

Proteins can be isolated from cell culture or tissues according to
methods well known to those of skill in the art. The isolated proteins are
purified using methods well known to those of skill in the art (e.g., TPAE,
differential protein precipitation (precipitation by salts, pH, and ionic
polymers), differential protein crystallization, bulk fractionation,
electrophoresis (PAGE, isoelectric focusing, capillary), and
chromatography (immunoaffinity, HPLC, LC)). Individual column fractions
containing protein mixtures of limited complexity are collected for use as
antigen.

2. Preparation of limited complexity protein pools using cDNA
expression libraries (Figure 13)

a. RNA Isolation 

i. Isolation of Total RNA 

Cultured cells or tissues are homogenized in a denaturing solution
containing 4 M guanidine thiocyanate. The homogenate is mixed
sequentially with 2 M sodium acetate (pH 4), phenol, and finally
chloroform/isoamyl alcohol or bromochloropropane. The resulting mixture
is centrifuged, yielding an upper aqueous phase containing total RNA.
Following isopropanol precipitation, the RNA pellet is dissolved in
denaturing solution (containing 4 M guanidine thiocyanate), precipitated
with isopropanol, and washed with 75% ethanol.

ii. Isolation of Cytoplasmic RNA 
Cells are washed with ice-cold phosphate-buffered saline and kept
on ice for all subsequent manipulations. The pellet of harvested cells is
resuspended in a lysis buffer containing the nonionic detergent Nonidet
P-40. Lysis of the plasma membranes occurs almost immediately. The
intact nuclei are removed by a brief microcentrifuge spin, and sodium
dodecyl sulfate is added to the cytoplasmic supernatant to denature
protein. Protein is digested with protease and removed by extractions
with phenol/chloroform and chloroform. The cytoplasmic RNA is
recovered by ethanol precipitation.

b. mRNA purification
Messenger RNA is purified from total or cytoplasmic RNA
preparations using standard procedures. Poly(A)+ RNA can be separated
from total RNA by oligo(dT) binding to the poly(A) tail of the mRNA.
Total RNA is denatured to expose the poly(A) (polyadenylated) tails.
Poly(A)-containing RNA is then bound to magnetic beads coated with
oligo(dT) and separated from the total or cytoplasmic RNA through magnetic
forces. The mRNA population can be further enriched for the presence of
full-length molecules through the selection of 5'-cap-containing mRNA
species.

c. cDNA synthesis
Different types of primers can be used to synthesize full-length or
5'-end-containing cDNA libraries from the isolated mRNA.

i. Oligo(dT) primer, which will generate cDNAs for
all mRNA species (Figure 14)

An example of the production of an adapted oligo(dT)-primed cDNA
library is provided in Figure 14.

ii. Functional protein motif-specific degenerate
oligonucleotides; these primers will generate cDNAs for a
limited number of genes belonging to the same
protein family or of functionally related proteins
(Figure 15)

An example of the production of an adapted sequence motif-specific
cDNA library is provided in Figure 15.

iii. Gene-specific oligonucleotide, which will produce cDNA
for only one particular mRNA species (Figure 16)

The oligonucleotides used for the cDNA production can contain
additional sequences: 1) protein tag-specific sequences for easier
purification of the recombinant proteins (6xHis, Figure 7), 2) restriction
enzyme sites, 3) a modified 5'-end for cDNA purification or DNA
construction purposes (Figure 17).

The conversion of mRNA into double-stranded cDNA for insertion
into a vector is carried out in two parts. First, intact mRNA, hybridized to
an oligonucleotide primer, is copied by reverse transcriptase and the
products are isolated by phenol extraction and ethanol precipitation. The
RNA in the RNA-DNA hybrid is removed with RNase H as E. coli DNA
polymerase I fills in the gaps. The second-strand fragments thus
produced are ligated by E. coli DNA ligase. Second-strand synthesis is
completed, residual RNA degraded, and cDNA made blunt with RNase H,
RNase A, T4 DNA polymerase, and E. coli DNA ligase.

d. Adapter ligation

Adapter molecules can be ligated to both ends of the blunt-ended
double-stranded cDNA or to only one end of the cDNA. Site-directed
adapter ligation can be achieved through the use of 5'-modified
oligonucleotides (for example, biotinylated or aminated) during cDNA
synthesis, which prevents adapter ligation to the 3' end of the cDNA. The
resulting cDNA molecules constitute a 5'-end cDNA library comprising the
5' non-translated region, the translational start codon AUG coding for a
methionine, followed by the coding region of the gene or genes. The
cDNA molecules are flanked by known DNA sequence on their 5'- and
3'-ends (Figures 14, 15 and 16).

e. cDNA amplification 

PCR primers to the known 5'- and 3'-end sequences or known
internal sequences can be synthesized and used for the amplification of
either the complete library or specific subpopulations of cDNA using an
extended 5'- or 3'-amplification primer in combination with the primer
located on the opposite side of the cDNA molecules (Figure 18).

f. Primer design for the amplification of gene sub-populations

The sub-population primers contain two portions (Figure 19). The
5'-part of the primer is complementary to a known sequence, and the
primer extends with its 3'-end into the unknown cDNA sequence.
Since each position in the cDNA part of the library can have an
adenosine, cytidine, guanosine or thymidine residue, 4 different
nucleotide possibilities exist for each nucleotide position. Four different
amplification primers can be synthesized, each containing the same
known sequence and extending by one nucleotide into the cDNA area of
the library. The 4 primers differ only at their most 3'-nucleotide, being
either A, C, G or T. If we suppose that the nucleotides (A, C, G, T) are
equally represented in a stretch of DNA, each one of the 4 amplification
primers will amplify one quarter of the total genes represented in the
cDNA library. By extending the amplification primer sequence further and
increasing the number of amplification primers, the complexity of the
amplification products can be further reduced. Extending the sequence
by 2 nucleotides requires the synthesis of 16 different primers, decreasing
the complexity 16-fold; 3 nucleotides require 64 different primers; and an
n-nucleotide extension requires 4^n different primers.
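
The primer arithmetic in this section can be illustrated with a short
Python sketch. It is offered only as an illustration under stated
assumptions: the known adapter sequence GATTACAGG and the function names
are hypothetical, and the expected fraction assumes, as the text does,
that the four nucleotides are equally represented.

```python
from itertools import product


def subpopulation_primers(known_sequence, extension):
    """Return the 4**extension primers that share known_sequence at the
    5' end and differ only in the `extension` nucleotides at the 3' end."""
    if extension < 0:
        raise ValueError("extension must be non-negative")
    return [known_sequence + "".join(tail)
            for tail in product("ACGT", repeat=extension)]


def expected_fraction(extension):
    """Fraction of the library each primer is expected to amplify,
    assuming A, C, G and T are equally represented."""
    return 1 / 4 ** extension


KNOWN = "GATTACAGG"  # hypothetical known (adapter) sequence

# Show how the primer count grows and the pool complexity shrinks.
for n in (1, 2, 3):
    primers = subpopulation_primers(KNOWN, n)
    print(n, len(primers), expected_fraction(n))
```

In practice base composition is not uniform, so the sub-populations
obtained with different primers would be expected to differ in size.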

g. PCR amplification 

PCR amplification entails mixing template DNA, two appropriate
oligonucleotide primers (5'- and 3'-end primers located in the known
added sequences, directed in complementary orientation), Taq or other
thermostable DNA polymerases, deoxyribonucleoside triphosphates
(dNTPs), and a buffer. The PCR products are analyzed after cycling on
DNA gels or through analysis on an ABI 377 using the GeneScan analysis
software. These analysis methods allow the determination of the
complexity of the amplified cDNA pool.

h. Production of a protein expression library 

Each amplified cDNA library sub-population is cloned 5' to 3' in a
bacterial (E. coli, etc.) or eukaryotic (Baculovirus, yeast, mammalian)
protein expression system. The gene is introduced with its own
translational initiation signal and a 6xHis tag in all 3 frames. For example:
the cDNA is restricted with two different, rare-cutting restriction enzymes
(5'-end BglII and 3'-end NotI) and cloned in the 5' to 3' orientation in the
Baculovirus transfer vector pVL1393 under the direct control of the
polyhedrin promoter.

i. Protein expression
Linearized Baculovirus DNA and recombinant transfer-vector DNA
are cotransfected into susceptible Sf9 insect cells with calcium
phosphate. For cotransfection, 10 µg of purified plasmid DNA is
prepared. An initial recombinant Baculovirus stock is prepared and
Sf9 cells are infected for recombinant protein production.

j. Protein purification 
The expressed recombinant proteins contain an affinity tag
(for example, a 6xHis tag). They are purified on Ni-NTA agarose.
Approximately 1 to 2 mg of 6xHis recombinant fusion protein is routinely
obtained per liter of insect cell culture.

k. Purification Tag removal 
If the expression vector or the amplification primer was constructed
with a proteolytic cleavage site for thrombin, the purification tag can be
removed from the recombinant proteins after the protein affinity
purification step.

II. Antibody generation by immunization of different animals with 
individual protein mixtures 

3. Preparation of Antibody protein capture reagents 
A purified protein preparation translated from a pool of cDNAs is
injected intramuscularly, intradermally or subcutaneously in the presence
of adjuvant into an animal of the chosen species (e.g., rabbit). Booster
immunizations are started 4 to 8 weeks after the priming immunization
and continued at 2- to 3-week intervals. The polyclonal antiserum is
purified using standard methods known to those skilled in the art.

The purified antibody batches can be used directly as protein
capture reagents without modification. In this case the antibody batches
from different animals have to be kept separate (each batch is one
capture reagent).

III. Antibody proteins are isolated and conjugated with nucleic acid
sequences which correspond to the original antigen preparation
resulting in the antibody capture reagents

Generation of bi-functional capture/sorting molecules for sorting of
the complex protein mixture on a solid phase.

The glycosylated C_H2 domain of the polyclonal antibodies is
conjugated to 5'-modified oligonucleotides using standard conjugation
methods. The resulting molecule has one protein capture moiety
(antibody) and one nucleic acid moiety (oligonucleotide) (Figure 20).

The antibody batches obtained after immunization of an animal with a
reduced complexity protein pool are conjugated with one
oligonucleotide sequence. Antibodies produced from multiple
immunization events with different protein pools are each conjugated to an
oligonucleotide with a different sequence (Figure 20).

4. Capture of target proteins using reactivity functionality and
sorting by oligonucleotide hybridization

Two different methods have been developed for making
oligonucleotides bound to a solid support: they can be synthesised in situ, or
presynthesised and attached to the support. In either case, it is possible
to use the support-bound oligonucleotides in a hybridisation reaction with
oligonucleotides in the liquid phase to form duplexes; the excess of
oligonucleotide in solution can then be washed away.

The support may take the form of particles, for example, glass
spheres, or magnetic beads. In this case the reactions could be carried
out in tubes, or in the wells of a microtitre plate. Methods for both
synthesising oligonucleotides and for attaching presynthesised
oligonucleotides to these materials are known (see, e.g., Stahl et al.
(1988) Nucleic Acids Research 16(7):3025-3039).

a. Preparation of amine-functionalized solid support
Oligonucleotides of a defined sequence are synthesized on an
amine-functionalized glass support. An amine function was attached at discrete
locations on the glass slide using a solution of 700 µl of
H2N(CH2)3Si(OCH2CH3)3 in 10 ml of 95% ethanol at room temperature for 3 hours.
The treated support was washed once with methanol and then once with
ethyl ether. The support was dried at room temperature and then baked at
110 °C for 15 hours. It was then washed with water, methanol and
water, and then dried.

The glass slide was reacted for 30 minutes at room temperature
with 250 mg (1 millimole) of phthalic anhydride in the presence of 2 ml
of anhydrous pyridine and 61 mg of 4-dimethylaminopyridine.

The product was rinsed with methylene dichloride, ethyl alcohol
and ether, and then dried. The products on the slide were reacted with
330 mg of dicyclohexylcarbodiimide (DCC) for 30 minutes at room
temperature. The solution was decanted and replaced with a solution of
117 mg of 6-amino-1-hexanol in 2 ml of methylene dichloride and then
left at room temperature for approximately 8 hours.

b. Oligonucleotide synthesis on a solid support 
The amine-functionalized solid support was prepared for
oligonucleotide synthesis by treatment with 400 mg of succinic anhydride
and 244 mg of 4-dimethylaminopyridine in 3 ml of anhydrous pyridine for
18 hours at room temperature. The solid support was then treated with 2 ml of
DMF containing 3 millimoles (330 mg) of DCC and 3 millimoles (420 mg)
of p-nitrophenol at room temperature overnight. The slide was washed
with DMF, CH3CN, CH2Cl2 and ethyl ether. A solution of 2 millimoles (234
mg) of H2N(CH2)6OH in 2 ml of DMF was reacted with the slide overnight.
The product of this reaction was a support,
-O(CH2)3NHCO(CH2)2CONH(CH2)5CH2OH. The slide was washed
with DMF, CH3CN, methanol and ethyl ether.

The functionalized ester resulting from the preparation of the glass
support was used for the synthesis of an oligonucleotide sequence. Each
nucleoside residue was added as a phosphoramidite according to the
procedure of Caruthers et al. (U.S. Patent No. 4,415,732).

5. Protein analysis of the captured proteins and complex protein
sample comparison

The purified antibody batches can be 1) directly attached to
a solid surface and incubated with protein samples, or 2) incubated with the
samples and subsequently bound to a solid support without using the
bi-functional capture molecule; alternatively, 3) the bi-functional capture
molecule can be used to capture its corresponding protein in a sample and
subsequently sort the captured proteins through specific nucleotide
hybridization (Figure 21).

IV. Antisense oligonucleotide capture reagents are immobilized in
discrete and known locations on a solid surface to create an
antibody capture array

6. Preparation of capture array surface

5'-aminated oligonucleotides are synthesized using
phosphoramidite chemistry and attached to N-oxysuccinimide esters.
The attached oligonucleotide sequences are complementary to the sorting
oligonucleotides of the bi-functional antibody molecules (Figure 20).
Proteins are captured through nucleic acid hybridization of their sorting
oligonucleotide to the complementary oligonucleotide attached to the solid
surface.

V. The antibody capture reagents are added to the total protein
mixture (reactivity step). The reaction mixture is then added to the
solid surface array under conditions which allow oligonucleotide
hybridization (sorting step).

7. Bi-functional reagent/protein capture and sorting

The bi-functional antibodies are incubated with the protein
sample under conditions that allow the antibodies to bind to their
corresponding antigen. The bi-functional antibody molecule with the
captured protein is added to the oligonucleotide-prepared capture array.
Under standard DNA annealing conditions, which do not denature the
antigen-antibody binding, the bi-functional antibody will hybridize through
its nucleic acid moiety to the complementary oligonucleotide.
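
The sorting step can be pictured as a lookup from each sorting
oligonucleotide to the array position that carries its complement. The
Python below is a minimal sketch of that idea, assuming perfect-match
hybridization only; the sequences, array positions and function names are
hypothetical, and real annealing behaviour (mismatches, melting
temperature) is not modelled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def build_capture_array(sorting_oligos):
    """Map each array position to the capture oligonucleotide that is
    complementary to the sorting oligonucleotide assigned to it."""
    return {position: reverse_complement(oligo)
            for position, oligo in sorting_oligos.items()}


def sort_conjugate(sorting_oligo, capture_array):
    """Return the position whose capture oligonucleotide is the exact
    reverse complement of sorting_oligo, or None if there is none."""
    target = reverse_complement(sorting_oligo)
    for position, capture_oligo in capture_array.items():
        if capture_oligo == target:
            return position
    return None


# Hypothetical sorting sequences for two antibody batches.
batches = {(0, 0): "CTGCATGGTGC", (0, 1): "CTGCAAGGTGC"}
array = build_capture_array(batches)
print(sort_conjugate("CTGCAAGGTGC", array))
```

Because each antibody batch carries a single sorting sequence, the
position at which a protein is found identifies the batch, and hence the
protein pool, that captured it.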

VI. The captured protein is identified using MALDI mass spectrometry

8. Analysis of the captured proteins

The attached proteins are analyzed using standard protein
analysis methods such as mass spectrometry.

Since modifications will be apparent to those of skill in this art, it is
intended that this invention be limited only by the scope of the appended
claims.

WHAT IS CLAIMED IS:

1. A gradient array, comprising:

a 2-dimensional array of moieties X and optionally Y, and
a surface Z for presenting moieties X and Y, wherein:
the surface Z comprises a solid support;

the X and Y moieties are arranged in a two dimensional array
of continuous gradients of one or more properties of X and Y;

the properties of moiety X are in a gradient along the X-axis;
the properties of moiety Y are in a gradient along the Y-axis;
the X moieties are each independently selected to bind to
biomolecules covalently or with sufficiently high affinity so that the
resulting complexes of biomolecule/capture compounds are stable under
conditions of mass spectrometric analysis; and

the Y moieties are each independently selected to increase
the selectivity of the binding by moiety X.

2. The gradient array of claim 1, wherein each X moiety in a 
row differs in hydrophilicity, lipophilicity, charge, size or reagent 
specificity. 

3. The gradient array of claim 1, wherein each Y moiety in each
column differs in hydrophilicity, lipophilicity, charge, size or reagent
specificity.

4. The gradient array of claim 1, wherein: 

a Y moiety is present at each locus and is bound to each X moiety 
or is at the same locus of the solid support as each X. 

5. The gradient array of claim 1, wherein:

a Y moiety is present at each locus and is bound to each X moiety
or is at the same locus of the solid support as each X moiety; and

each Y moiety in each column differs in hydrophilicity, lipophilicity,
charge, size or specificity.

6. The gradient array of claim 1, wherein the X moiety
comprises an azobenzene group and a gradient of hydrophilicity is created
by increased light exposure.

7. The gradient array of claim 1, wherein the Y moiety
comprises an azobenzene group and a gradient of hydrophilicity is created
by increased light exposure.

8. The gradient array of claim 1, wherein the X moiety
comprises a charged group and a gradient of charge is created by
increased exposure to current.

9. The gradient array of claim 1, wherein the Y moiety
comprises a charged group and a gradient of charge is created by
increased exposure to current.

10. The gradient array of claim 1, wherein the property along X
axis is reagent specificity for NH2, SH, SS or OH group.

11. The gradient array of claim 1, wherein the property along Y
axis is reagent specificity for NH2, SH, SS or OH group.

12. The gradient array of claim 1, wherein moiety X is selected to
covalently bind to biomolecules or to bind with sufficiently high affinity so
that the resulting complexes of biomolecule/capture compounds are stable
under conditions of mass spectrometric analysis; moiety Y increases the
selectivity of the binding by X such that the capture compound binds to
fewer biomolecules when Y is present than in its absence.

13. The gradient array of claim 1, wherein each X is selected
from the group consisting of an active ester, an active halo moiety, an
amino acid side chain-specific functional group, a reagent that binds to
active site of an enzyme, a ligand that binds to a receptor, a specific
peptide that binds to a biomolecule surface, a lectin, an antibody, an
antigen, biotin and streptavidin.

14. The gradient array of claim 13, wherein an X is an α-halo
ether, an α-halo carbonyl group, maleimido, a metal complex, an epoxide,
an isothiocyanate, or an antibody against phosphorylated or glycosylated
peptides/proteins.

15. The gradient array of claim 13, wherein X is
-C(=O)O-Ph-pNO2, -C(=O)O-C6F5, -C(=O)-O-(N-succinimidyl), -OCH2-I,
-OCH2-Br, -OCH2-Cl, -C(O)CH2I, -C(O)CH2Br or -C(O)CH2Cl.

16. The gradient array of claim 1, wherein loci or X or Y moieties at
each locus comprise a mass modifying tag.

17. A method of analysing biomolecules comprising,

a) contacting a composition comprising a biomolecule with an array
of any of claims 1-16 to form biomolecule complexes with X and/or Y
moieties at loci on the array; and

b) identifying or detecting bound biomolecules.

18. A collection of capture compounds, comprising a plurality of
capture compounds, wherein each capture compound comprises a moiety
Z for presenting moieties X and Y, wherein the moiety Z comprises a solid
support wherein the X and Y moieties are arranged in a two dimensional
array of continuous gradients of one or more properties; a plurality of X
moieties that are selected to bind to biomolecules covalently or with
sufficiently high affinity so that the resulting complexes of
biomolecule/capture compounds are stable under conditions of mass
spectrometric analysis and a plurality of Y moieties that increase the
selectivity of the binding by X such that there is selectivity along the Y
axis of the two dimensional array.

19. A method for analysis of biomolecules, comprising:

a) contacting a composition comprising a biomolecule with a
collection of capture compounds of claim 18 to form capture
compound-biomolecule complexes; and

b) identifying or detecting bound biomolecules.